Preface


In today’s world of intense information dissemination and globalisation, the need of
the hour is to bring out a well-structured and authentic textbook, especially in the field of 
computer science and technology. Our humble effort in this regard is to make a sincere value 
addition through this textbook. 

This book is the culmination of our fascination for the subject of computation and its 
applications. It has been designed for students of computer science at the postgraduate level 
and is logically conceived, self-contained, well-organised and user-friendly. 

The book contains an in-depth coverage of all the topics related to the theory of 
computation as mentioned in the syllabuses of B.E., M.C.A. and M.Sc. (Computer Science) 
of various universities. A sufficient amount of theoretical input, supported by a number of
illustrations, is included for those who take deep interest in the subject.

The book includes 15 chapters, segregated into three sections. Part one, ‘Fundamentals
of Computation’, contains chapters 1 to 3. Chapter 1 is introductory in nature and gives
a broad picture of the subject. Chapters 2 and 3 discuss Formal Language Theory and
Regular Expressions and Languages. Part two, ‘Models and Principles of Computation’,
comprises chapters 4 to 7, and discusses topics like Finite State Machine, Equivalent
Automata, Optimisation/Minimisation of DFA, Finite Automata and Regular Expressions,
Transducers et cetera in a systematic way. Finally, part three, ‘Advanced Models and
Principles of Computation’, includes chapters 9 to 15, and addresses issues of Context-Free
Grammar and Context-Free Languages, Simplification of Context-Free Grammar,
Pushdown Automata, Pumping Lemma, Turing Machine, TM Extensions and Languages
and Formal Languages/Grammar Hierarchy. Each chapter is followed by exercises to
provide a hands-on experience and promote problem solving abilities. Appendices at the 
end of the book discuss necessary mathematical and computational information. 

The Theory of Computation requires students to understand problems in a proper 
perspective. Our best advice is to remember the fact that almost all the key ideas are 
programming problems in disguise. The machines and languages, in spite of being different, 
arrive at essentially the same techniques as are used in the design and implementation of real
applications. The synergy of a proper approach and technique will not only help in mastering 
the subject, but also in refining the style of computing. 

Thus, coupled with motivation, enthusiasm and penchant to learn the subject, along with 
regular practice and systematic review of earlier chapters every week, this textbook will 
enable every student to appreciate and understand the subject in its totality and also prepare 
him to meet challenges in the future course of his study in this field. 
Fundamentals of Automata


“Theory of computation begins with alphabets and flows as 
poetry in mathematical rhythms.” 


Introduction 


When we study computability, we study problems in an abstract sense. For 
example, 


a. Addition is the problem of returning a third number that is the sum of two given 
numbers. 

b. Travelling Salesman problem (TSP) is one in which a list of distances between some
number of cities is given and the person is asked to find the shortest route so that
he visits each city once and returns to the start.

c. Halting Problem (HP) is one in which a program is given some appropriate input and 
it needs to be decided whether the program, when run on that input, loops forever or 
halts. 


Problems (b) and (c) are the two problems of computer interest. 

In both these problems, the statement of the problem does not give the actual values 
needed to provide the result but just tells what kind of objects they are. A set of actual values 
for a problem is called an instance of the problem (by this terminology, all homework 
problems done by school students are considered as instances of problems). 

Each of these problems takes an instance as its input, and the relationship between
the input and the output must be understood. In order to solve these problems, there are
certain things one should know.


1. Can it be solved algorithmically: is there any definite procedure that solves any
instance of the problem in a finite amount of time? In other words, is it computable?
Not all problems are computable.

(e.g. HP)

2. How hard is it to solve? What kind of resources are needed and how much of these
resources are required? Some problems are harder than others.
(e.g. TSP: it is computable, but no efficient algorithm for it is known.)
To find a solution to a problem algorithmically, it is required to know whether an algorithm
for solving it exists. No programming language is needed. Similarly, if a problem can be 
solved by a computing mechanism, one has to have an abstract model of computation that 
can treat it in a rigorous mathematical way. Thus we can start with an abstract model given 
in figure 1.1 which is a computer that receives some input (an instance of a problem), has 
some computing mechanism, and produces some output (the solution of that instance). The 
configuration of computing mechanism at a given point, in its processing, is called the 
internal state. 


[Figure: computing mechanism with its internal state, receiving input and producing output]


Figure 1.1. Model-I


The given model can be simplified further, by making some assumptions. 


1. Every instance of a problem has a solution. 
2. The solution to any instance of a problem is unique. 


3. We can list all instances of a problem, along with its possible solutions, in a systematic
way.


When these assumptions are applied to (a), it is observed that every instance of addition has a
unique solution. Each instance is a pair of numbers and a possible solution is any third
number. We might try to list all instances along with all possible solutions by listing
all triples starting with 0 and then all triples starting with 1 etc. However, since there are infinitely
many triples starting with zero, we would never get around to listing any starting with one.
Suppose instead that we are only concerned with natural numbers (0, 1, ...) and that we list all triples
that sum to zero (that is just (0,0,0)), then all triples that sum to one (that is (1,0,0),
(0,1,0), (0,0,1)) etc. It is then guaranteed that we will eventually list any given triple.
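
The listing by total sum described above can be sketched in Python. This is only an illustration; the function name `triples_by_sum` and the order of triples within one group are assumptions, not part of the text.

```python
from itertools import count, islice


def triples_by_sum():
    """Yield every triple of natural numbers, grouped by the sum of its entries."""
    for total in count(0):                    # 0, 1, 2, ... without end
        for a in range(total, -1, -1):        # first entry, largest first
            for b in range(total - a, -1, -1):
                # The third entry is forced once the total is fixed.
                yield (a, b, total - a - b)


# The first four triples: the one summing to zero, then the three summing to one.
first_four = list(islice(triples_by_sum(), 4))
```

Because each group is finite, any given triple is reached after finitely many steps, which is exactly what listing by first component fails to achieve.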

As for the assumption that every instance has a solution, we cannot consider solving
problems algorithmically unless it holds. An algorithm must produce
some answer for every instance. If there is no answer for some instance, then whatever 
answer it produces will necessarily be wrong. 

With these assumptions, the abstract model can be reduced as shown in figure 1.2. This 
machine takes an instance of a problem along with a possible solution as its input and lights 
one lamp (Y) if the solution is correct and the other (N), if it is not correct. 
[Figure: checking machine with its internal state, taking an instance and a possible solution as input and lighting lamp Y or N]


Figure 1.2. Model-II


Model-I is called the original model: given an instance, it must produce a
solution, so it is an algorithm for solving a problem. Model-II is called an algorithm for checking
a problem.

Now let us start with the implementation of model-I and model-II. If we have a program
that checks whether a proposed solution to an instance of a problem is correct or not and 
another program that systematically generates every instance of the problem along with 
every possible solution, then the problem is solved. But how to develop a program or a 
subroutine, that checks the correctness of the solution to a problem under consideration, for 
a given instance under the assumptions made? 

This could be achieved if we decide to use a subroutine that can be called repeatedly and 
returns the next instance/solution in the list. We can simply put this in a loop that gets the 
next instance/solution and checks whether the instance listed matches with the instance to 
be solved. If it does, we use the checking program as a subroutine to tell us, whether the 
solution listed along with the instance is correct, otherwise we reiterate the loop. In order 
to do this, we must confirm two things: 


1. for any input, the program always produces a solution in a finite amount of time and 
then checks the solution 

2. the procedure produces a solution if the checking algorithm says that it is correct 
(Y). 
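
The loop described above can be sketched in Python for the addition problem. All names here (`check_addition`, `listed_pairs`, `solve`) are hypothetical, and the sketch assumes natural numbers and that every instance has a solution; without that assumption the loop would never stop.

```python
from itertools import count


def check_addition(instance, solution):
    """Checking program (model-II): is `solution` the sum of the two numbers?"""
    a, b = instance
    return a + b == solution


def listed_pairs():
    """Systematically list every instance of addition with every possible solution."""
    for total in count(0):
        for a in range(total + 1):
            for b in range(total - a + 1):
                yield (a, b), total - a - b


def solve(instance, check, listing):
    """Solving program (model-I) built from a listing and a checking program."""
    for listed_instance, candidate in listing:
        # Only consult the checker when the listed instance is the one we want.
        if listed_instance == instance and check(listed_instance, candidate):
            return candidate
```

For example, `solve((2, 3), check_addition, listed_pairs())` walks through the listing until it meets the instance (2, 3) paired with the candidate 5.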


Let us assume that an instance/solution is encoded as strings of symbols. Our task is to 
distinguish those strings that encode the solved instances of the problem, among all strings 
drawn from the same set of symbols. Such a distinguished set of strings is, of course, a 
language in the formal sense. One approach then, to study computability in general, is to 
study how to define and recognise formal languages.

Now how do we define a language and specify an algorithm to recognise it? A
finite language can be defined by simply listing all of its strings, and the list can then be ‘hard-coded’
into a recognition algorithm (or ‘hard-wired’ into a machine). For languages in general, the methods of
definition fall into three broad categories:
1. Algebraic definition: Here languages are defined by expressions, specifying how
they are built up from a finite set of simple languages, using various operations for
combining languages.

2. Grammars: These can be understood as algorithms for generating languages, that is, for
listing all (and only) the strings in the language.

3. Automata: These are simply abstract models of computers, specialised to recognise
particular languages.


Now let us concentrate on what can be stored in the internal state of model-II of computation, 
to implement it successfully. 


[Figure: model-II, showing the contents of its internal state]


Figure 1.3. Model-II (Modified Version)


The data storage must have


1. the capability for reading the program,
2. a method of accessing the input, which should be something like
   a. INPUT() - a function that returns the current input character. Generally its form is:

          I = INPUT()
      or
          If (INPUT() = ‘a’) Then .....
      or
          PUSH(INPUT())

      Assume that INPUT() returns a unique value EOF, if all of the input has been
      consumed.
   b. NEXT() - a function that bumps to the next position in the input. It returns True
      in case input is not EOF and False otherwise.
3. The program returns True if the input is a string in the language and False otherwise.


Thus, for the successful implementation of a problem, a language should be defined and
then hard-coded; the algorithm must be designed to recognise the language for any
input string supplied using INPUT(), and it should return the solution for the given instance.
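
As a rough illustration, the INPUT()/NEXT() interface and a recogniser for a hard-coded finite language might look as follows in Python. The class name `Reader`, the sample language and the reading of NEXT() as reporting on the position after the bump are assumptions made for this sketch.

```python
EOF = object()  # unique end-of-input value (assumed representation)


class Reader:
    """Input access through INPUT() and NEXT(), as described in the text."""

    def __init__(self, text):
        self._text = text
        self._pos = 0

    def INPUT(self):
        """Return the current input character, or EOF once all input is consumed."""
        return self._text[self._pos] if self._pos < len(self._text) else EOF

    def NEXT(self):
        """Bump to the next position; True if the input there is not EOF."""
        if self._pos < len(self._text):
            self._pos += 1
        return self.INPUT() is not EOF


LANGUAGE = {"ab", "abc"}  # hypothetical finite language, hard-coded as a list of strings


def recognise(text, language=LANGUAGE):
    """Return True if the input is a string in the language and False otherwise."""
    reader = Reader(text)
    seen = []
    while reader.INPUT() is not EOF:
        seen.append(reader.INPUT())
        reader.NEXT()
    return "".join(seen) in language
```

Under these assumptions `recognise("ab")` returns True and `recognise("ba")` returns False.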


1.1 Computation 


1.1.1 Definition 


Computation can be defined as finding a solution to a problem from the given inputs by 
means of an algorithm. 

The theory of computation, a subfield of computer science and mathematics, deals with 
finding solutions to problems from the given inputs. For thousands of years, computing was 
done with pen and paper, or chalk and slate, or mentally (sometimes with the aid of tables). 
The theory of computation began in the early twentieth century, before the invention of 
modern electronic computers. 

At that time, mathematicians were trying to find which maths problems could be solved 
by simple methods and which could not. The first step was to define what they meant by 
a ‘simple method’ for solving a problem. In other words, they needed a formal model of 
computation. 


1.1.2 Model of computation 


A model of computation is a formal, abstract definition of a computer. Using a model,
one can more easily analyse the intrinsic execution time or memory space of an algorithm
while ignoring many implementation issues. There are many models of computation, which
differ in computing power (i.e., some models can perform computations that are impossible
for other models) and in the cost of various operations.

The different computational models are: 


a. Serial models
■ Turing machine
■ random access machine
■ primitive recursive
■ cellular automaton
■ finite state machine
■ cell probe model
■ pointer machine
■ alternation
■ alternating Turing machine
■ nondeterministic Turing machine
■ oracle Turing machine
■ probabilistic Turing machine
■ universal Turing machine
■ quantum computation


b. Parallel models 


■ multiprocessor model
■ work-depth model
■ parallel random-access machine


1.1.3 Serial Models 


Turing machine: This is a model of computation consisting of a finite state machine 
controller, a read-write head and an unbounded sequential tape. Depending on the current 
state and symbol read on the tape, the machine can change its state and then move the 
head left or right. Unless otherwise specified, a Turing machine is deterministic. A Turing
machine stores characters on an infinitely long tape, with one square being scanned by
a read/write head at any given time.


Random access machine: This is a model of computation whose memory consists of 
an unbounded sequence of registers, each of which may hold an integer. In this model, 
arithmetic operations are allowed to compute the address of a memory register. 


Primitive recursive: A total function which can be written, using only nested 
conditional (if-then-else) statements and fixed iteration (for) loops. These models use 
functions and function composition to operate on numbers. 


Cellular automaton: This is a two-dimensional organisation of simple, finite state 
machines whose next state depends on their own state and the states of their eight closest 
neighbours. In general the machines may be arranged in meshes of higher or lower 
dimensions, having larger neighbourhoods or arbitrarily complex processors. 
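
A minimal Python sketch of one update step of such a two-dimensional cellular automaton follows. The text does not fix a rule, so the rule used here (Conway's Game of Life) and the wrap-around edges are assumptions chosen purely for illustration.

```python
def step(grid):
    """Return the next generation of a 2-D cellular automaton on a wrap-around grid."""
    rows, cols = len(grid), len(grid[0])
    new_grid = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count live cells among the eight closest neighbours.
            live = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            # Assumed rule: a cell is live next if it has exactly three live
            # neighbours, or if it is live now and has exactly two.
            if live == 3 or (grid[r][c] == 1 and live == 2):
                new_grid[r][c] = 1
    return new_grid


# A row of three live cells, which flips between horizontal and vertical.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
```

Each cell is itself a two-state machine, and its next state depends only on its own state and those of its eight neighbours.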
Finite state machine (FSM): This is a model of computation consisting of a set of
states, a start state, an input alphabet and a transition function that maps input symbols and 
current states to a next state. Computation begins in the start state with an input string. It 
changes to new states, depending on the transition function. There are many variants, for 
example: 


a. machine associates an output with each transition (Mealy machine),
b. machine associates an output with each state (Moore machine),


c. transitions conditioned on no input symbol (ε-transition, or null), or more than one
transition for a given symbol and state (nondeterministic finite state machine), or


d. one or more states, designated as accepting states (recogniser), etc.


Cell probe model: This is a model of computation where the cost of a computation is 
measured by the total number of memory accesses to a random access memory, with cell 
size of log n bits. All other computations are not counted and are considered to be free. 


Pointer machine: This is a model of computation whose memory consists of an
unbounded collection of registers or records, connected by pointers. Each register may 
contain an arbitrary amount of additional information. No arithmetic is needed to compute 
the address of a register. The only way to access a register is by following pointers. 


Alternation: This is a model of computation proposed by A. K. Chandra, L.
Stockmeyer, and D. Kozen, which has two kinds of states, AND and OR. The definition of
accepting computation is adjusted accordingly.


Alternating Turing machine: This is a nondeterministic Turing machine having
universal states, from which the machine accepts only if all possible moves out of that state
lead to acceptance.


Nondeterministic Turing machine: This is a Turing machine which has more than
one next state for some combinations of contents of the current cell and current state. An 
input is accepted, if any move sequence leads to acceptance. 


Oracle Turing machine: This is a Turing machine with an extra oracle tape and three
extra states q_?, q_y and q_n. When the machine enters q_?, control goes to state q_y, if the oracle
tape content is in the oracle set; otherwise control goes to state q_n.


Probabilistic Turing machine: This is a Turing machine in which some transitions are 
random choices among finitely many alternatives. 
Note-1:

The typical, deterministic Turing machine (TM) can be seen as a probabilistic TM 
with not more than one alternative for each transition. A nondeterministic TM is a 
probabilistic TM, ignoring the probabilities. 


Universal Turing machine: This is a Turing machine that is capable of simulating any 
other Turing machine by encoding the latter. 


Quantum computation: Here the computation is based on quantum mechanical 
effects, such as superposition or entanglement, in addition to classical digital manipulations. 


1.1.4 Parallel Models 


Multiprocessor model: This is a model of parallel computation, based on a set of 
communicating sequential processors. 


Work-depth model: This is a model of parallel computation in which one keeps track 
of the total work and depth of computation, without worrying about how it maps onto a 
machine. 


Parallel random-access machine: This is a shared memory model of computation,
where the processors typically execute the same instruction synchronously and access to 
any memory location occurs in unit time. 


Shared memory: Here all the processors have the same global image of (and access to) 
the whole memory. 


1.1.5 Abstraction and Abstract machine 


Abstraction means a processor design which is not intended to be implemented as hardware, 
but is the notional executor of a particular intermediate language (abstract machine language) 
used in a compiler or interpreter. 

Abstract machine: This is a procedure for executing a set of instructions in some formal
language, possibly by taking an input data and producing the output. Such abstract machines 
are not intended to be constructed as hardware but are used in thought experiments about 
computability. 

Examples: Finite State Machine, Turing Machine. 


Downloaded from https://www.cambridge.org/core. University of Sussex Library, on 20 Apr 2018 at 09:32:38, subject to the Cambridge Core terms of use, available at 
https://www.cambridge.org/core/terms. https://doi.org/10.1017/UP09788175968363.002 


Note-2:


■ Abstract machine according to its generic meaning is a behavioural model of a
computer.

■ Abstract machine or abstract computer is a model of computation in computational
complexity theory, used to analyse the complexity of algorithms. An abstract
computer usually postulates its design in terms of the set of allowed operations
and the time and space complexity of atomic operations. The best known example
is the Turing machine.

■ Abstract machine, as a processor design, is not intended to be implemented as
hardware, but is the notional executor of a particular intermediate language (abstract
machine language) used in a compiler or interpreter. An abstract machine has an
instruction set, a register set and a model of memory. It may provide instructions
which are closer to the language being compiled than any physical computer or
it may be used to make the language implementation easier to port to other platforms.


1.2 Finite State Machine 


Definitions of Automata


An automaton (plural: automata) is a self-operating machine. There exist the following
equivalent definitions of the concept of an automaton.


a. An automaton is a system which obtains, transforms, transmits and uses information 
to perform its functions without direct human participation. It is self-operational. 
(Lerner, 1972). 

b. An automaton is a machine that has an input tape and that can be put into any of several
states. Various symbols are written on the tape before execution. The automaton begins 
reading the symbols on the tape, from left to right. Upon reading a symbol from the 
tape, the machine (typically) changes its state and advances the tape. After reading the 
input, the machine halts. 

c. An automaton is a model of computation consisting of a set of states, a start state, an
input alphabet and a transition function that maps input symbols and current states to 
a next state. Computation begins in the start state with an input string. It changes to 
new states depending on the transition function. 
d. A finite state machine (FSM) or finite automaton is a machine, or a mathematical
model of a machine, which can only reach a finite number of states and transitions. It 
is used in mathematical problem analysis. 

e. A general model of a machine is called a finite automaton; ‘finite’ because the number 
of states and the alphabet of input symbols is finite; automaton because the structure or 
the machine (as it is more commonly called) is deterministic, i.e., the change of state 
is completely governed by the input. 

f. Automata are abstract mathematical models of machines that perform computations 
on an input by moving through a series of states or configurations. If the computation 
of an automaton reaches an accepting configuration, it accepts that input. At each stage 
of computation, a transition function determines the next configuration on the basis of 
a finite portion of the present configuration. 

g. Finite automata are computing devices that accept/recognise regular languages and are 
used to model operations of many systems encountered in practice. Their operations 
can be simulated by very simple computer programs.


1.3. Examples of Finite State Machine 


1.3.1 Counting to Five 


Here is a FSM describing a process of counting up to five. The transition out of state 5 is a loop
back to itself.


[Figure: chain of counting states ending in state 5, which loops back to itself]


Figure 1.4. Process of Counting
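
The counting machine can be simulated with a few lines of Python. The figure did not survive extraction, so the start state 0 and the function name `count_to_five` are assumptions of this sketch.

```python
def count_to_five(ticks, state=0):
    """Advance the counter once per tick; state 5 loops back to itself."""
    for _ in range(ticks):
        state = state + 1 if state < 5 else 5
    return state
```

After three ticks the machine is in state 3; after nine ticks it is still in state 5, because of the loop on that state.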


1.3.2 Getting up in the morning 


In this case, more than one arrow emerges from a particular circle, i.e. either transition could
occur. Notice the two edges leaving the ‘Alarm Buzzing’ state. This tells us that each time
the alarm goes off in the morning, the person might choose to turn the alarm off and
get up, or he might hit the snooze button and go back to sleep.
[Figure: states ‘Alarm Buzzing’, ‘Hitting Snooze Button’ and ‘Getting Up’, with two edges leaving ‘Alarm Buzzing’]


Figure 1.5. Process of Getting Up


1.3.3. A Playing board 


Several children’s games fit the following description—pieces are set up on a playing board; 
dice are thrown (or a wheel is spun) and a number is generated at random. Based on the
generated number, the pieces on the board are rearranged, as specified by the rules of the 
game. Then, another child throws or spins and rearranges the pieces again. There is no skill 
or choice involved - the entire game is based on the values of the random numbers. Consider 
all possible positions of the pieces on the board and call them states. We begin with the 
initial state of the starting positions of the pieces on the board. The game then changes 
from one state to another, based on the value of the random number. For each possible 
number, there is one and only one resulting state, given the input of the number and the 
prior state. This continues until one player wins and the game is over. This is called the 
final state. 


1.3.4 A simple computer system 


Consider a computer with an input device, a processor, some memory and an output 
device. To calculate 3 + 4, we write a simple list of instructions and feed them into 
the machine, one at a time (e.g., STORE 3 TO X; STORE 4 TO Y; LOAD X; ADD Y; 
WRITE TO OUTPUT). Each instruction is executed as it is read. If all goes well, the 
machine outputs ‘7’ and terminates the execution. This process is similar to the board
game described above. The state of the machine changes after each instruction is executed, and
each state is completely determined by the prior state and the input instruction (thus this 
machine is defined as deterministic). No choice or skill is involved and no knowledge 
of the state of the machine is needed. The machine simply starts at an initial state,
changes from state to state based on the instruction and the prior state, and reaches 
the final state of writing ‘7’. In this simple computer example, the start state is the 
original state of the machine before the program execution and in the final state, ‘7’, 
is written on the output device. The input alphabet is the set of strings representing the
instructions. 
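
The simple computer can be mimicked by a small Python interpreter for the five instructions quoted above. This is a sketch under assumptions: the accumulator, the memory dictionary and the function name `run` are illustrative choices, not part of the text.

```python
def run(program):
    """Execute instructions one at a time; each step depends only on the
    prior state (memory, accumulator, output) and the current instruction."""
    memory, accumulator, output = {}, 0, []
    for instruction in program:
        words = instruction.split()
        op = words[0]
        if op == "STORE":            # e.g. STORE 3 TO X
            memory[words[3]] = int(words[1])
        elif op == "LOAD":           # e.g. LOAD X
            accumulator = memory[words[1]]
        elif op == "ADD":            # e.g. ADD Y
            accumulator += memory[words[1]]
        elif op == "WRITE":          # WRITE TO OUTPUT
            output.append(str(accumulator))
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return output


PROGRAM = ["STORE 3 TO X", "STORE 4 TO Y", "LOAD X", "ADD Y", "WRITE TO OUTPUT"]
result = run(PROGRAM)
```

Here the input alphabet is the set of instruction strings, and the final state is the one in which ‘7’ has been written to the output.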


1.3.5 Traffic Light 


Below is a FSM for a traffic light. From the picture, we can deduce that all traffic lights start 
off by showing red, then make a transition to a green light and finally display a yellow light 
before returning to the red state. 

Notice that this diagram indicates that it is impossible to go directly from green to red; 
you must go through the yellow state only. 


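[Figure: states Red, Green and Yellow; Start enters Red, and the transitions run Red → Green → Yellow → Red]


Figure 1.6. Traffic Light Modeling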
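
The three-state traffic light can be written as a transition table in Python. The names `NEXT_LIGHT` and `cycle` are hypothetical; the transitions are the ones described in the text.

```python
# Each state has exactly one successor, so green can never go straight to red.
NEXT_LIGHT = {"red": "green", "green": "yellow", "yellow": "red"}


def cycle(steps, state="red"):
    """Return the sequence of lights shown, starting from the red start state."""
    history = [state]
    for _ in range(steps):
        state = NEXT_LIGHT[state]
        history.append(state)
    return history
```

Calling `cycle(3)` walks once around the loop and ends back in the red state.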


Traffic modeling: Let us now draw a state machine to characterise a traffic light that 
might change into a flashing-red mode at some point, and then later on change back into the 
normal green-yellow-red cycle.


Solution: There are two ways to approach this problem: to add a state that indicates 
‘flashing red’, or to add a state that means ‘dark’, which then alternates with red to appear 
as flashing red. Here is a solution with a ‘flashing red’ state: 


[Figure: the traffic light machine with an additional ‘Flash’ state]


Figure 1.7. Traffic Light Modeling with Flash State


Here is a solution with a ‘dark’ state: 


Figure 1.8. Traffic Light Modeling with Dark State 
They look amazingly similar! The difference is in the transitions that happen. In the first
answer, the machine stays in the ‘Flash’ state all night; in the second answer, the machine 
alternates between ‘Red’ and ‘Dark’ overnight. In both cases, we’ve decided that flashing 
red should change into red in the morning. 


1.3.6 Vending machine 


Let us consider the operation of a soft drink vending machine which charges 15 cents 
for a can. Pretend that you are the machine. Initially you are waiting for a customer 
to come and put some coins, that is, you are in the waiting-for-customer state. Let us 
assume that only nickels and dimes are used (for simplicity). When a customer comes 
and puts in the first coin, say a dime, you are no longer in the waiting-for-customer 
state. You have received 10 cents and are waiting for more coins to come. So we might 
say that you are in the 10-cents state. If the customer puts in a nickel, then you have
received 15 cents and you wait for the customer to select a soft drink. So you are in 
another state, say 15-cents state. When the customer selects a soft drink, you must give 
the customer a can of soft drink. After that you stay in that state, until another coin 
is put in, to start the process afresh or you may terminate the operation and start from 
the initial state again. The states and the transitions between them, for this vending 
machine, can be represented with the diagram below. In figure 1.9, circles represent
states and arrows represent state transitions. Ds on arrows represent a dime and Ns, a
nickel. It is assumed that the machine terminates its operation when it receives 15 cents or
more.


Figure 1.9. Vending Machine

In this example, you, as a vending machine have gone through (transitions between) a 
number of states, responding to the inputs from the customer (coins in this case). A vending 
machine looked at this way is an example of finite automaton. 
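
A hedged Python sketch of the vending machine as a transition table follows. States are named by the number of cents received; the 5-cents and 20-cents states and the names `TRANSITIONS`, `TERMINAL` and `insert_coins` are assumptions, inferred from the rule that the machine stops at 15 cents or more.

```python
# (current state, coin) -> next state; N is a nickel, D is a dime.
TRANSITIONS = {
    (0, "N"): 5,   (0, "D"): 10,
    (5, "N"): 10,  (5, "D"): 15,
    (10, "N"): 15, (10, "D"): 20,
}
TERMINAL = {15, 20}  # the machine terminates once it has 15 cents or more


def insert_coins(coins, state=0):
    """Feed a sequence of coins to the machine and return the state reached."""
    for coin in coins:
        if state in TERMINAL:
            break                      # further coins are ignored after termination
        state = TRANSITIONS[(state, coin)]
    return state
```

A dime followed by a nickel, `insert_coins("DN")`, takes the machine from the waiting-for-customer state through the 10-cents state to the 15-cents state.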


1.4 Components of Finite State Automata 


A Finite State Automaton is usually described as consisting of three components:


■ a Control Unit
■ a Read Unit, and
■ an Input Tape (or input file)


The nature and operations of these components are as follows: 


■ Input Tape. The input tape, sometimes called the data tape of the automaton,
consists of a sequence of cells. Each of the cells beginning from the left-hand end 
of the tape, (which is considered as the start of the tape) contains one of the words 
of the string to be processed by the machine. The words themselves are usually 
identified as comprising input/data symbols, with the individual words treated as if 
they constituted single symbols. The identification of the medium, containing the 
input string as a tape, is a historical artifact. Now, it only serves to convey the notion, 
that the words of the input string are made available to the automaton one at a time 
in a sequence, from left to right (beginning with the first or the leftmost word). 

■ Read Unit. The read unit of the automaton reads words from the cells of the input
tape (beginning with the first word in the leftmost cell of the input tape) and provides 
the individual words, one at a time, to the control unit. 

■ Control Unit. The control unit governs the operations of the automaton by
performing a sequence of transitions between the internal states available to it. 
Beginning with its initial state, the control unit executes a transition as each word 
is provided to it by the read unit, where these transitions are determined by the 
transition function of the automaton. If, immediately after the last word of the 
input string has been read, the control unit moves into an accepting state, then the 
automaton is said to accept or recognise the string of words. Otherwise, if the control 
unit is not in an accepting state after the last word is read, the automaton rejects the 
string. 
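
The interplay of the three components can be sketched in Python as follows. The function name `accepts` and the example automaton (which accepts strings over a and b containing an even number of a's) are illustrative assumptions, not taken from the text.

```python
def accepts(transitions, start, accepting, tape):
    """Run the control unit over the input tape and report accept or reject."""
    state = start                              # control unit begins in its initial state
    for word in tape:                          # read unit delivers words left to right
        state = transitions[(state, word)]     # one transition per word read
    # Accept only if the control unit ends in an accepting state.
    return state in accepting


# Hypothetical automaton: tracks whether the number of a's read so far is even.
EVEN_A = {
    ("even", "a"): "odd",  ("even", "b"): "even",
    ("odd", "a"): "even",  ("odd", "b"): "odd",
}
```

With this table, `accepts(EVEN_A, "even", {"even"}, "abba")` reports acceptance, while the tape "ab" is rejected.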
In other words a finite automaton can also be thought of as a device, which satisfies the
following conditions: 


■ The tape has a left end and extends to the right, without an end.

■ The tape is divided into squares, in each of which a symbol can be written, prior to
the start of the operation of the automaton.

■ The tape has a read-only head.

■ The head is always at the leftmost square, at the beginning of the operation.

■ The head moves to the right by one square, every time it reads a symbol. It never
moves to the left. When it sees no symbol, it stops and the automaton terminates its
operation.

■ There is a finite control which determines the state of the automaton as well as
controls the movement of the head.


[Figure: finite control connected to temporary storage]


Figure 1.10. Components of Finite State Automata


1.5 Elements of Finite State System 


Finite state machines consist of 4 main elements:

■ states, the conditions the machine can be in

■ state transitions, which are movements from one state to another

■ rules or conditions, which must be met to allow a state transition

■ input events, which are either externally or internally generated. These may possibly
trigger the rules and lead to state transitions.


A finite state machine must have an initial state which provides a starting point and a current 
state which remembers the product of the last state transition. Received input events act as 
triggers, which cause an evaluation of some kind of the rules that govern the transitions from 
the current state to other states. The best way to visualise a FSM is to think of it as a flow 
chart or a directed graph of states. There are more accurate abstract modelling techniques 
that can be used. 


[Figure: a possible finite state machine implementation. Input events, which may be generated internally and externally, enter the FSM system; inputs trigger rules, which lead to state transitions; outputs are generated by the current state.]


Figure 1.11. A possible Finite State Machine Control System Implementation


Thus, in a finite automaton, at any given time the control unit is in some internal state and
the read unit of the automaton reads symbols from the cells of the input tape. As time
advances, the internal state is determined by the transition function. The transition function
gives the next state in terms of:


■ the current state
■ the current input symbol
■ the information currently present in the temporary storage.

During the transition from one state to another, output may be produced, or the information in
the temporary storage may be changed. This transition from one state to the next is called
a move.


1.5.1 State 


A state is a complete set of properties, transmitted by an object to an observer via one or 
more channels. Any change in the nature or quantity of such properties in a state is detected 
by an observer and thus a transmission of information occurs. 


The following are some of the types of states:

a. Start state: An initial state or condition of a finite state machine.

b. Accepting state: If a finite state machine finishes an input string and is in an accepting
state, the string is accepted or considered to be valid.

c. Next state: The state immediately following the current state, defined by the transition
function of a finite state machine and the input, is the next state.

d. Universal state: A state in an alternating Turing machine from which the machine
accepts only if all the possible moves lead to acceptance is called a ‘universal state’.

e. Existential state: A state in a nondeterministic Turing machine from which the
machine accepts if any move leads to acceptance is the existential state.

f. Dead/Trap state: A non-final state of a finite state machine whose transitions on every
input symbol terminate on itself.


1.5.2 Transition 


The following are the two equivalent definitions for the transition of a finite state system. 


a. Transition is the act of passing from one state/place to the next. 
b. A change from one place or state or subject or stage, to another. 
Transitions are represented in the following ways:
■ State diagram or transition diagram
■ State transition table
■ Transition functions


1.5.3 State diagram


A diagram consisting of circles to represent states and directed line segments to represent
transitions between the states is called a state diagram. The symbols used in state diagrams
are listed in Table 1.1.


NOTATION   | MEANING
[symbol]   | Start state (start of the process)
[symbol]   | Final / Accepting / Absorption state (end-of-process state)
[symbol]   | Transition
[symbol]   | Input (guarded transition)
[symbol]   | Output


Table 1.1 State Diagram Symbols


For a finite state machine, a state diagram is a directed graph where
a. each edge is a transition between two states
   ■ For a DFA, NFA, and Moore machine, the input is labelled on each edge.
   ■ For a Mealy machine, the input and output are labelled on each edge.
b. each vertex is a state
   ■ For a Moore machine, the output is signified for each state.


EXAMPLE 1.5.1: (State diagram) 
a. S1 and S2 are states and S1 is an accept state. Each edge is labelled with the input.


Figure 1.12. State Diagram

b. S0, S1, and S2 are states. Each edge is labelled with “j/k”, where j is the input and k is
the output.


Figure 1.13. Sample State Diagram with the Edge Label Form j/k (edge labels include 1/1 and 0/0)


1.5.4 State transition table 


A state transition table can be described, in general, as follows:


■ it is a tabular representation of transitions that takes two arguments and returns a value
■ rows correspond to states and columns correspond to inputs
■ entries correspond to next states
■ the start state is marked with an arrow ($\rightarrow$)
■ the accepting states are marked with a star (*).


States \ Inputs | Present inputs
S0              |
S1              |
...             |
Sn              |

Table 1.2 State Transition Table Format


For a finite state machine, a state transition table is a table describing the transition function,
$\delta$, of a finite automaton. This function determines the state (or states, in the case of a
nondeterministic finite automaton) to which the automaton will move, given an input to the
machine.

EXAMPLE 1.5.2: (State transition table)
An example of a state transition table of machine M (Figure 1.12) is given below:


States \ Inputs | 1  | 0
S1              | S1 | S2
S2              | S2 | S1


Table 1.3. State Transition Table of Machine M


All the possible inputs to the machine are enumerated across the columns of the table. All
the possible states are enumerated across the rows. From the state transition table (Table
1.3), it is easy to see that if the machine is in S1 (the first row), and the next input is character
1, the machine will stay in S1. If a character 0 arrives, the machine will make a transition to
S2, as denoted by the arrow from S1 to S2 in the diagram.
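
The table lookup described above can be sketched in Python. This is only an illustration: the names `TRANSITIONS` and `next_state` are hypothetical, and the S2 row is read from Table 1.3 as "stay in S2 on 1, move to S1 on 0".

```python
# Transition table of machine M (Table 1.3) as a nested dictionary.
# Outer key: current state (row); inner key: input symbol (column);
# value: next state (table entry).
TRANSITIONS = {
    "S1": {"1": "S1", "0": "S2"},
    "S2": {"1": "S2", "0": "S1"},
}


def next_state(state: str, symbol: str) -> str:
    """Look up the row for `state` and the column for `symbol`."""
    return TRANSITIONS[state][symbol]


print(next_state("S1", "1"))  # S1: the machine stays in S1
print(next_state("S1", "0"))  # S2: the machine moves to S2
```

A nested dictionary mirrors the row/column layout of the table directly, which keeps the code easy to compare with the printed table.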

For certain types of finite state machine, such as a nondeterministic finite automaton (NFA), a
new input may cause the machine to be in more than one state. This is denoted in a state
transition table by a pair of braces { }, with the list of all legal states inside the braces. An
example is given below:


States \ Inputs | 1  | 0        | $\epsilon$
S1              | S1 | {S2, S3} | $\emptyset$
S2              | S2 | S1       | $\emptyset$
S3              | S2 | S1       | S1


Table 1.4 State Transition Table


Here, a nondeterministic machine in the state S1 reading an input 0 will be in
two states at one instant, the states S2 and S3. The last column defines the legal transitions
of states for the special character $\epsilon$. This special character allows the NFA to move to a
different state when no input is provided. In state S3, the NFA may move to S1 without
consuming an input character.
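
A minimal sketch of how such a table can drive an NFA is given below. The prose fixes only two facts (S1 on input 0 leads to S2 and S3, and S3 may move to S1 without input); the remaining entries of `NFA`, the empty-string marker `EPS` and the function names are assumptions made for illustration.

```python
EPS = ""  # hypothetical marker for moves that consume no input

# Each entry is a SET of next states; an empty set means "no move".
NFA = {
    "S1": {"1": {"S1"}, "0": {"S2", "S3"}, EPS: set()},
    "S2": {"1": {"S2"}, "0": {"S1"}, EPS: set()},
    "S3": {"1": {"S2"}, "0": {"S1"}, EPS: {"S1"}},
}


def epsilon_closure(states):
    """All states reachable from `states` without consuming input."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for target in NFA[state][EPS]:
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure


def step(states, symbol):
    """Set of states the NFA may be in after reading one symbol."""
    reached = set()
    for state in epsilon_closure(states):
        reached |= NFA[state][symbol]
    return epsilon_closure(reached)


print(sorted(step({"S1"}, "0")))  # ['S1', 'S2', 'S3']
```

Reading 0 in S1 reaches S2 and S3, and the move without input from S3 adds S1, so the machine may be in any of the three states.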


1.5.5 State diagram from transition table 


It is possible to draw a state diagram from the table. Simple steps are given below: 


Step-1: Draw the circles to represent the states given.

Step-2: For each of the states, scan across the corresponding row and draw
an arrow to the destination state(s). There can be multiple arrows for an
input character, if the automaton is an NFA.

Step-3: Designate a state as the start state. The start state is given in the
formal definition of the automaton.

Step-4: Designate one or more states as the accept state.

EXAMPLE 1.5.3: (Construction of transition diagram from transition table) 


Table 1.5 State Transition Table


Figure 1.14. Sample State Transition Diagram Obtained from the State Transition Table


Note-3: 


A finite automaton has
a. a finite set of states, one of which is designated as the initial
state or start state, and some (maybe none) of which are
designated as final states.
b. an alphabet $\Sigma$ of possible input symbols.
c. a finite set of transitions that specify, for each state and each
symbol of the input alphabet, the next state.


1.6 Mathematical Representation of Finite State Machine


1.6.1 Ordered Quintuple Specification of Finite Automata 


A finite-state automaton $A$ can be specified by the entries in the following ordered quintuple:


$$A = (Q, \Sigma, \delta, q_0, F)$$

where


■ $Q$ is a finite non-empty set of labels identifying the states of the machine,

■ $\Sigma$ is a finite vocabulary of input or data symbols,

■ $\delta$ is the transition function of the machine, $\delta : Q \times \Sigma \rightarrow Q$
(it maps $Q \times \Sigma$ into $Q$),

■ $q_0 \in Q$ identifies the starting state of the machine, and

■ $F \subseteq Q$ is the set of accepting states or final states.


1.6.2 Transition Function


The transition function of an automaton, which is also called its transition matrix or
transition table, specifies the state into which the machine will move on the basis of
its current state and the symbol that is read. Thus, the transition function $\delta$ can be represented
as a set of ordered triples of the form

$$(q_i, w, q_j)$$

where

■ $q_i \in Q$ is a label identifying the current state of the automaton,
■ $w \in \Sigma$ is the symbol that is read with the automaton in the state $q_i$, and
■ $q_j \in Q$ is the label of the new state into which the automaton shifts on reading the
symbol $w$.


Elements of the transition function can also be represented in a rewriting-rule format such
as

$$q_i w \rightarrow q_j \quad \text{or} \quad \delta(q_i, w) = q_j.$$


In other words, there is an arc in the transition diagram from state $q_i$ to $q_j$ labelled $w$ iff
$\delta(q_i, w) = q_j$.


1.6.3 Language of the Automaton


In general, a finite state machine accepts a string $w = w_1 w_2 \ldots w_n$ if there is a path in the
transition diagram such that it


■ begins at a start state
■ ends at an accepting state
■ has the sequence of labels $w_1, w_2, \ldots, w_n$.


For example, 1010 is a string accepted by the automaton of Figure 1.14.


Formally, a string consisting of the $n$ input symbols $w_1 w_2 \ldots w_n$ is accepted by the
automaton $A$ if, for each symbol $w_i \in \Sigma$ (where $i = 1$ to $n$), there is a transition


$$(q_{i-1}, w_i, q_i) \in \delta$$


with $q_0 \in Q$ being the starting state of $A$ and $q_n \in F$ being an accepting state. Thus, there is
a sequence of transitions such that, as the first symbol $w_1$ of the string is read, the transition
$(q_0, w_1, q_1)$ shifts the machine out of its starting state, and as the last symbol $w_n$ is read, the
last transition $(q_{n-1}, w_n, q_n)$ in the sequence shifts the machine into an accepting state. The
set of all strings of input symbols accepted by an automaton $A$ is called the language of
the automaton and is denoted $L(A)$.


1.6.4 Extending the Transition Function to Strings 


The extended transition function $\hat{\delta}$ is a function that takes a state $q$ and a string $w$ and returns
the state $p$ that is reached when the automaton starts from $q$ and processes $w$.
For the empty string, $\hat{\delta}(q, \epsilon) = q$. Suppose that $w = xa$, where $a$ denotes the last symbol of $w$. Then


$$\hat{\delta}(q, w) = \delta(\hat{\delta}(q, x), a).$$
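
The recursion can be sketched directly in Python. The two-state table `DELTA` and the name `delta_hat` are hypothetical, chosen only to make the sketch runnable; the empty-string base case (return the state unchanged) is the usual convention.

```python
# Single-symbol transition function, stored as (state, symbol) -> next state.
DELTA = {
    ("S1", "1"): "S1", ("S1", "0"): "S2",
    ("S2", "1"): "S2", ("S2", "0"): "S1",
}


def delta_hat(q: str, w: str) -> str:
    """Extended transition function: the state reached from q after reading w."""
    if w == "":                      # base case: no input, no move
        return q
    x, a = w[:-1], w[-1]             # split w = xa, a being the last symbol
    return DELTA[(delta_hat(q, x), a)]


print(delta_hat("S1", "1001"))  # S1
```

The call on "1001" unwinds as S1, S1, S2, S1, S1: the two zeros move the machine to S2 and back.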


1.6.5 Design of a finite state machine 


Consider the finite state machine with the following characteristics:

Set of states: $Q = \{q_0, q_1, q_2, q_3\}$

Alphabet (where the input string comes from): $\Sigma = \{a, b\}$

Initial state (where the head starts, at the first element): $q_0$

Set of final/accept states (if we stop there, we accept the input): $F = \{q_0, q_1, q_2\}$

Transition function $\delta$ (rules for how to move from state to state). The following are the transition
rules:


■ from state $q_0$ with input $a$, go to state $q_0$: $\delta(q_0, a) = q_0$
■ from state $q_0$ with input $b$, go to state $q_1$: $\delta(q_0, b) = q_1$
■ from state $q_1$ with input $a$, go to state $q_0$: $\delta(q_1, a) = q_0$
■ from state $q_1$ with input $b$, go to state $q_2$: $\delta(q_1, b) = q_2$
■ from state $q_2$ with input $a$, go to state $q_0$: $\delta(q_2, a) = q_0$
■ from state $q_2$ with input $b$, go to state $q_3$: $\delta(q_2, b) = q_3$
■ from state $q_3$ with any input, go to state $q_3$: $\delta(q_3, a) = q_3$ and $\delta(q_3, b) = q_3$

Figure 1.15 and Table 1.6 describe the above transitions.


State Transition Diagram:


Figure 1.15. State Transition Diagram

Transition table:


States \ Present input | a     | b
$\rightarrow$ * $q_0$  | $q_0$ | $q_1$
* $q_1$                | $q_0$ | $q_2$
* $q_2$                | $q_0$ | $q_3$
$q_3$                  | $q_3$ | $q_3$


Table 1.6 State Transition Table


1.6.6 Demonstration of finite state machine for the given input string 


Consider the FSM of Figure 1.15 with the input string $W = ababbabb$. The initial configuration
and state are shown in Figure 1.16. The action of the FSM for each character of the input string is
shown from step 1 to step 8.


Initial state: $q_0$


Figure 1.16. Finite State Machine

Step 1: FSM after reading the input symbol a (state $q_0$).
Step 2: FSM after reading the input symbol b (state $q_1$).
Step 3: FSM after reading the input symbol a (state $q_0$).
Step 4: FSM after reading the input symbol b (state $q_1$).
Step 5: FSM after reading the input symbol b (state $q_2$).
Step 6: FSM after reading the input symbol a (state $q_0$).
Step 7: FSM after reading the input symbol b (state $q_1$).
Step 8: FSM after reading the input symbol b (state $q_2$).


Figure 1.17. Demonstration of finite state machine (steps 1 to 8) for the input string
$W = ababbabb$.


1.6.7 Language acceptance
Consider the FSM of Figure 1.16 with the following input strings:


i. $W = abba$
We begin from state $q_0$ with input $a$ and remain in the same state. From the state
$q_0$ with $b$ as input, we go to state $q_1$. From $q_1$ with $b$ we go to $q_2$, and finally from $q_2$
(with $a$ again) we end up at $q_0$, which is a final state. Since the end is a successful one,
‘abba’ is a word in the language defined by the finite automaton.

ii. $W = abbba$
We begin from state $q_0$ with input $a$ and remain in the same state. Next, from
the state $q_0$ (with $b$ as input) we go to state $q_1$, from $q_1$ with $b$ we go to $q_2$, from $q_2$
with $b$ we go to $q_3$, and finally from $q_3$ (with $a$) we remain in $q_3$. Since we do not end
up in one of the final states $\{q_0, q_1, q_2\}$, this is an unsuccessful end.
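
The two walks above can be reproduced with a short Python sketch of the machine designed in Section 1.6.5. The names `DELTA`, `START`, `FINAL` and `run` are illustrative assumptions; the transitions themselves are the rules listed in Section 1.6.5.

```python
# Transition rules of Section 1.6.5, stored as (state, symbol) -> next state.
DELTA = {
    ("q0", "a"): "q0", ("q0", "b"): "q1",
    ("q1", "a"): "q0", ("q1", "b"): "q2",
    ("q2", "a"): "q0", ("q2", "b"): "q3",
    ("q3", "a"): "q3", ("q3", "b"): "q3",
}
START = "q0"
FINAL = {"q0", "q1", "q2"}


def run(word: str):
    """Return the list of visited states and whether the word is accepted."""
    state = START
    path = [state]
    for symbol in word:
        state = DELTA[(state, symbol)]
        path.append(state)
    return path, state in FINAL


for word in ("abba", "abbba", "ababbabb"):
    path, accepted = run(word)
    print(word, " -> ".join(path), "accepted" if accepted else "rejected")
```

Since $q_3$ is the only non-final state and can be reached only by three consecutive b's, the sketch rejects exactly the words that contain 'bbb'.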


1.6.8 Finite state machine implementation (Software) 
A finite state machine can be implemented in software with a state transition matrix. Alternatively:


a. in some cases, a sparse matrix is implemented with linked lists, or
b. a large switch statement detects the internal state, and individual switch
statements within it decode the input symbol.
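
As a rough sketch of the matrix approach, the machine of Section 1.6.5 can be stored as a dense two-dimensional list indexed by state number and symbol number. The encoding (states 0 to 3, symbols a and b mapped to columns 0 and 1) and all names below are assumptions made for illustration.

```python
# Column index of each input symbol (hypothetical encoding).
SYMBOL_INDEX = {"a": 0, "b": 1}

# MATRIX[state][symbol] = next state; row i corresponds to state q_i.
MATRIX = [
    [0, 1],  # q0: a -> q0, b -> q1
    [0, 2],  # q1: a -> q0, b -> q2
    [0, 3],  # q2: a -> q0, b -> q3
    [3, 3],  # q3: a -> q3, b -> q3 (trap state)
]
ACCEPTING = {0, 1, 2}


def accepts(word: str) -> bool:
    """Run the matrix-driven machine from state 0 over the whole word."""
    state = 0
    for symbol in word:
        state = MATRIX[state][SYMBOL_INDEX[symbol]]
    return state in ACCEPTING


print(accepts("abba"), accepts("abbba"))  # True False
```

A dense matrix gives constant-time lookups but stores every state/symbol pair; when most entries are empty, a sparse structure such as a dictionary keyed by (state, symbol) plays the role of the linked-list representation in item (a).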


In hardware, an FSM may be built from a programmable logic device, relays, or even a
mechanical cam timer combined with other elements.


We discuss FSM and its variants in detail from chapter 4 to chapter 8. 


1.7 Automata Classification 


An automaton is a system which obtains, transforms, transmits and uses information to
perform its functions without direct human participation. It is self-operational. The following
is the classification of automata:


1.7.1 Based on Modeling


Probabilistic automata: In a probabilistic automaton, there is a predetermined probability
of each of the next states, given the current state and input.


Non-probabilistic automata: In a non-probabilistic automaton, there is no predetermined
probability of the next state, given the current state and input.


1.7.2 Based on implementation


Deterministic automata: In the case of deterministic automata, the output is uniquely 
determined by the input sequences, i.e., we get a definite output given any input. The 
behaviour of such automata can be accurately predicted, if the transfer operator is known 
and given in the form of a table of a logical function. Also, we need to know the initial state 
and the input sequence. 


Statistical automata: In statistical automata, random output sequences are generated,
given any fixed input. The amount of randomness can be set by assigning a probability to
each output, given the system's current state and input sequence.


Memoryless automata: A memoryless automaton, a concept in automata theory,
recognises only one input at a time and produces the output based on that input. The output
is not influenced by any inputs which arrived earlier. The reaction time of the automaton
is constant for all input signals. The internal state of such an automaton is independent of
any external action.


Finite memory automata: Finite memory automata, a concept in automata theory, 
refers to the type of automata where the group of output signals generated at a given quantised 
time depends not only on the signals applied at the same moment but also on those which 
arrived earlier. These preceding external actions (or fragments of them) are recorded in the 
automaton by a variation of its internal state. The reaction of such an automaton is uniquely 
determined by the group of input signals that has arrived and by its internal state at a given 
time. These factors also determine the state into which the automaton goes. 


Infinite memory automata: It refers to an abstract circuit of a logical automaton which,
in principle, is suitable for realising any information-processing algorithm. The Turing machine
belongs to this class of automata.


1.7.3 Based on processing 
■ Acceptors and recognisers


Acceptors are those which either accept the input or do not. Recognisers are those which
either recognise the input or do not. The following automata fall in this category:


a. Deterministic finite state machine
b. Nondeterministic finite state machine
c. Pushdown automata (PDA)

Deterministic finite state machine: A deterministic finite state machine or
deterministic finite automaton (DFA) is a finite state machine where for each pair of state
and input symbol there is a unique next state. In other words, for each state there is exactly
one transition for each possible input.


Nondeterministic finite state machine: In nondeterministic finite automata (NFA), 
there can be more than one transition from a given state, for a given possible input (there 
may be several possible next states). 


In nondeterministic automata, the next state depends not only on the current input event,
but also on an arbitrary number of subsequent input events. Until these subsequent events
occur, it is not possible to determine which state the machine is in.


Pushdown automata: A pushdown automaton (PDA) is a generalisation of a finite
state automaton (FSA). Like an FSA, a PDA changes from state to state, reading symbols of
the input. Unlike an FSA, its transitions also update a stack, either by popping symbols or
pushing them.

■ Transducers


Transducers generate output from the given input. The following automata are in this 
category: 


a. Mealy machine 
b. Moore machine 


Mealy machine: Mealy machine is a finite state machine, where the outputs are 
determined by the current state and the input. 


Moore machine: Moore machine is a finite state automaton, where the outputs are 
determined by the current state alone. 
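
The difference between the two transducers can be sketched in Python with a hypothetical example, a machine that tracks the parity of the 1s read so far. The state names `even` and `odd`, the output tables and both function names are assumptions made for illustration.

```python
# Shared state transitions: the state flips whenever a 1 is read.
T = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd", ("odd", "1"): "even",
}

# Mealy outputs depend on (state, input); Moore outputs on the state alone.
MEALY_OUT = {
    ("even", "0"): "0", ("even", "1"): "1",
    ("odd", "0"): "1", ("odd", "1"): "0",
}
MOORE_OUT = {"even": "0", "odd": "1"}


def run_mealy(word: str, start: str = "even") -> str:
    """Emit one output per transition, chosen by state and input."""
    state, out = start, []
    for symbol in word:
        out.append(MEALY_OUT[(state, symbol)])
        state = T[(state, symbol)]
    return "".join(out)


def run_moore(word: str, start: str = "even") -> str:
    """Emit one output per visited state, including the start state."""
    state = start
    out = [MOORE_OUT[state]]
    for symbol in word:
        state = T[(state, symbol)]
        out.append(MOORE_OUT[state])
    return "".join(out)


print(run_mealy("1101"))  # 1001
print(run_moore("1101"))  # 01001
```

For an input of length n, the Mealy sketch produces n outputs, while the Moore sketch produces n + 1, because the start state also contributes an output.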


1.7.4 Based on the input 


Tree automata: A tree automaton is a type of finite state machine. It deals with tree 
structures, rather than the strings, like more conventional finite state machines. 


Linear automata: Finite automata that operate on languages of finite words. 


Rabin/Büchi automata: Finite automata that operate on languages of infinite words.


1.7.5 Based on applications


Cellular Automaton: The cellular automaton consists of a line of cells, each coloured 
either black or white. At every step, there is a definite rule that determines the colour of 
a given cell from the colour of its immediate left and right neighbours on the step before. 
These automata are used for making pretty pictures and animations. 
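
A minimal Python sketch of such a rule is shown below. It assumes one particular rule (a cell becomes black exactly when its left and right neighbours differed on the previous step) and wrap-around at the ends of the line; both choices, and the names `step` and `show`, are illustrative assumptions.

```python
def step(cells):
    """One update: new colour of cell i = left neighbour XOR right neighbour."""
    n = len(cells)
    # cells[i - 1] wraps to the last cell when i == 0; (i + 1) % n wraps the right end.
    return [cells[i - 1] ^ cells[(i + 1) % n] for i in range(n)]


def show(cells):
    """Render black cells (1) as '#' and white cells (0) as '.'."""
    return "".join("#" if c else "." for c in cells)


cells = [0, 0, 0, 1, 0, 0, 0]  # a single black cell in the middle
for _ in range(3):
    print(show(cells))
    cells = step(cells)
```

Printing successive rows one under another gives the kind of picture the text mentions; a different rule only requires changing the expression inside `step`.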


Geographic automata: A system that is used to simulate the behaviour and distribution 
of objects in space such as householders, pedestrians, vehicles, shops, roads, land parcels, 
sidewalks etc., with respect to their properties and their locations, is a geographic automata 
system. 

In developing geographic automata systems, the aim is to infuse spatial properties into
automata tools and to adopt an object-based view of urban systems in terms of their activities.

Geographic automata can be of a fixed or non-fixed type. Fixed geographic
automata represent the objects that do not change their location over time. In the context of
urban systems, these objects are road links, building footprints, parks etc.

Non-fixed geographic automata symbolise the entities that change their location over
time, such as pedestrians, vehicles, households etc.


1.8 Automata in Real World 


1.8.1 Advantages of FSM 


■ Their simplicity makes it easy for inexperienced developers to implement them with
no extra knowledge (low entry level).

■ Predictability (in deterministic FSMs): given a set of inputs and a known current state,
the state transition can be predicted, allowing easy testing.

■ Due to their simplicity, FSMs are quick to design, implement and execute.

■ The FSM is an old knowledge representation and system modelling technique. It has been
around for a long time. It is well known, even as an artificial intelligence technique,
with lots of examples to learn from.

■ FSMs are relatively flexible. There are a number of ways to implement an FSM-based
system in terms of topology, and it is easy to incorporate many other techniques.

■ It enables easy transfer from a meaningful abstract representation to a coded
implementation.

■ Low processor overhead makes it well suited to domains where execution time is
shared between modules or subsystems. Only the code for the current state needs
to be executed, together with perhaps a small amount of logic to determine the
current state.

■ It enables easy determination of the ‘reachability’ of a state, when represented in an
abstract form. It is immediately obvious whether a state is achievable from another
state or not, and what is required to achieve the state.


1.8.2 Disadvantages of FSM 


■ The predictable nature of deterministic finite state machines can be unwanted in some
domains such as computer games (a solution may be a nondeterministic FSM).

■ Larger systems implemented using an FSM can be difficult to manage and maintain
without a well-thought-out design. The state transitions can cause a fair degree of
‘spaghetti factor’ when trying to follow the line of execution.

■ FSMs are not suitable for all problem domains. An FSM should only be used when a system’s
behaviour can be decomposed into separate states with well-defined conditions for
state transitions. This means that all states, transitions and conditions need to be
known and well-defined.

■ The conditions for state transitions are rigid. In other words, they are fixed (this can
be overcome by using a Fuzzy State Machine (FuSM)).


1.8.3. Applications of FSM 


■ FSMs are extensively used in the video game industry. Games like Warcraft
III take advantage of complex FSM systems to control the AI. Chat dialogues, where
the user is prompted with choices, can also be run using FSMs.

■ Besides controlling bots, dialogue and environmental conditions within video games,
FSMs also have a large role outside the video game industry. For example, cars,
airplanes, and robotics (machinery) employ complex FSMs. Even websites have an
FSM. Websites that offer menus for you to traverse other detailed sections of the
website act much like an FSM with transitions between states.

■ The applications of FSMs are also found in language processing: parsing,
morphological representations, spelling, information retrieval, tagging, stemming,
image compression, cryptography and the design of compilers.

■ Cellular automata are used for making pretty pictures and animations.

■ A geographic automaton is used to simulate the behaviour and distribution of objects
in space such as householders, pedestrians, vehicles, shops, roads, land parcels,
sidewalks etc., with respect to their properties and their locations.

■ The application of FSMs to biological and biomedical problem solving is the most significant
improvement in recent years. Fourier-transform nuclear magnetic resonance
(NMR), NMR imaging (or tomography), x-ray tomography, x-ray diffraction, high
performance liquid chromatography, differential scanning calorimetry and mass
spectrometry are techniques which employ FSMs.


Exercises

1. Define computation. Explain the different computational models.
2. What is a Finite State Machine? Explain with examples.
3. Explain, with a neat diagram, the components of finite state automata.
4. Explain, using a diagram, the elements of Finite State System.
5. What is a transition? Explain, with examples, the different symbols used in a state
transition diagram.
6. Give the general procedure for drawing a state diagram from transition table.
7. Present the ordered quintuple specification of finite automata.
8. Give the data structure for implementation of finite state machine.
9. Classify automata based on processing of data.
10. Define cellular and geographic automata.
11. Explain the advantages and disadvantages of automata.
12. Explain the uses of finite state machine.




Downloaded from https://www.cambridge.org/core. University of Sussex Library, on 20 Apr 2018 at 09:32:38, subject to the Cambridge Core terms of use, available at 
https://www.cambridge.org/core/terms. https://doi.org/10.1017/UPO09788175968363.002 


Formal Language Theory 


‘Mathematics is the language of science.’ 


Introduction 


A language is a system of signs used to communicate information to others. However, the
language of computation is a combination of both English and mathematics. Fundamentally,
a computer is a symbol manipulator, in the sense that it takes sequences of symbols as inputs
and manipulates them as per the program specifications. These symbols are precise and
unambiguous, unlike the language of humans. The first step in communicating a problem
to a machine is the design of a proper language of computation, and this is the fundamental
object of computability.
While discussing languages, it is important to note two cases:


a. During the evaluation of an input expression (e.g. in a calculator), the language of
arithmetic expressions handles both the input and the output communication.

b. In the case of a web form, the language simply describes all the legitimate inputs to
the field, and the output is simply a binary value: Yes or No.


Thus, the problems belonging to case (a), have both input and an output language, while, 
those belonging to case (b), have only an input language. It is of interest to note that, from 
the perspective of theory of computation, any type of problem can be expressed in terms of 
a language recognition. 

Since a language is a medium of communication, it should be given some meaning 
(i.e. its semantics). But much of the manipulation of symbols, strings and language could 
be done effectively without understanding their semantics and this is the subject of this 
section. In other words, the mathematical study of the theory of computation begins with 
the understanding of the mathematics of symbols and strings. 


2.1 Symbols, Alphabets and Strings 


2.1.1 Symbols 


A symbol is a single object and is an abstract entity that has no meaning by itself. It is often
called uninterpreted. It can be a character (letters from various alphabets, digits and special
characters are the most commonly used symbols). Normally, characters from a typical
keyboard are used as symbols.


EXAMPLE 2.1.1: (Symbols) 
Greek Alphabet


Upper | Lower | Name    | Upper | Lower | Name
Α     | α     | alpha   | Ν     | ν     | nu
Β     | β     | beta    | Ξ     | ξ     | xi
Γ     | γ     | gamma   | Ο     | ο     | omicron
Δ     | δ     | delta   | Π     | π     | pi
Ε     | ε     | epsilon | Ρ     | ρ     | rho
Ζ     | ζ     | zeta    | Σ     | σ     | sigma
Η     | η     | eta     | Τ     | τ     | tau
Θ     | θ     | theta   | Υ     | υ     | upsilon
Ι     | ι     | iota    | Φ     | φ     | phi
Κ     | κ     | kappa   | Χ     | χ     | chi
Λ     | λ     | lambda  | Ψ     | ψ     | psi
Μ     | μ     | mu      | Ω     | ω     | omega


Table 2.1. Greek alphabet symbols.


Symbol | Meaning                    | Symbol | Meaning                        | Symbol | Meaning
≡      | is identically equal to    | ∀      | for all/for every              | ∉      | does not belong to
≈      | is approximately           | Δ      | finite difference or increment |        |
≪      | is much less than          | ∃      | there exists                   | A → B  | A maps into B
≫      | is much greater than       | ⇔      | is equivalent to               | ∝      | is proportional to
ℂ      | complex set                | ⇒      | implies                        | ∫ dx   | integral
ℕ      | natural set                | ~      | difference                     | ∇      | vector differential
ℝ      | real set                   | ℜ      | real part of a complex no.     |        |
ℚ      | rational set               | ⊂      | subset                         | ℑ      | imaginary part of a complex no.
ℤ      | integer set                | ⊆      | subset equals                  | Π      | the product of the terms (pi)
⊄      | is not a subset            | ∅      | empty set                      |        |
…      | ellipsis                   | ⊊      | subset not equals              | f′(x)  | first derivative of f(x)
∈      | belongs to                 | f″(x)  | second derivative of f(x)      |        |


Table 2.2 Mathematical Symbols.


2.1.2 Alphabets 


An alphabet is any finite, non-empty set of symbols/characters. It is denoted by $\Sigma$. While
distinguishing two alphabets, the second one is usually denoted by $\Gamma$.


EXAMPLE 2.1.2: (Sample Alphabets)


a. $\Sigma = \{0, 1\}$ is the binary alphabet, consisting of the symbols 0 and 1.
b. $\Sigma = \{a, b, \ldots, z\}$ is the lowercase English alphabet.
c. $\Sigma = \{a, b, c\}$ is an alphabet of three symbols.
d. $\Sigma = \{A, B, \ldots, Z\}$ is the uppercase English alphabet.
e. $\Sigma = \{-, \cup, |\}$ is an alphabet consisting of some symbols of set operations.
f. $\Sigma = \{0, 1, <>\}$ is the binary alphabet with a delimiter.
Note-1: 
a. When a symbol is used with more than one meaning, then the symbol 
is said to be overloaded. 


b. Each symbol of an alphabet may also be called a letter of the alphabet 
or simply a letter. 


2.1.3 Strings 


A string is a finite sequence of symbols chosen from some alphabet. In other words, a string
is a finite sequence of symbols over an alphabet. It is usually denoted by ‘W’ or ‘S’. Quite
often, the letters u, v, w, x, y and z are used to denote strings, while the letters i, j, k, l, m and
n are used to denote the natural numbers.

The length of a string is defined as the number of symbols in the string and is denoted
by $|x|$ for the string $x$.

A null string is a string with no symbols. In other words, it is a string of length zero and
is denoted by $\lambda$ or $\Lambda$ or $\epsilon$.

EXAMPLE 2.1.3: (Strings/Words)
a. If $\Sigma = \{0, 1\}$, then 01010, 1111, 11, 00, 01, 10 etc. are some of the strings chosen from
this alphabet.

b. If $\Sigma = \{a, b\}$, then ab, abab, aabb etc. are the words chosen from this alphabet.

c. If $\Sigma = \{a, b, c\}$, then aaabcc, b, ab, bc, abc, cc, etc. are the words chosen from this
alphabet.

d. applejuice is a string over the alphabet $\Sigma = \{a, b, \ldots, z\}$.


2.2 Operations on Strings 


There are a number of operations on strings. In this section, some of them are
discussed.


2.2.1 Concatenation of Strings


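Concatenation of strings is simply the ‘gluing’ of strings together, placing them adjacent
to each other in order to form a new string.


Mathematically, if $S_1$ and $S_2$ are two strings, then their concatenation $S = S_1 \circ S_2$ is the
string formed by writing the symbols of $S_1$ followed by the symbols of $S_2$. For two sets of
strings $S_1$ and $S_2$, the operation extends elementwise:


$$S_1 \circ S_2 = \{xy \mid x \in S_1,\ y \in S_2\}$$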


EXAMPLE 2.2.1: (Concatenation of strings) 


a. If $S_1$ = para and $S_2$ = graph, then $S_1 \circ S_2$ = paragraph.

b. If $\Sigma = \{a, b\}$, $S_1 = ab$, $S_2 = baa$, then $S = S_1 \circ S_2 = abbaa$.

c. Let $x = 0100101$ and $y = 1111$; then $x \circ y = 01001011111$.

d. $abba \circ \epsilon = abba$. Here $|\epsilon| = 0$.

e. Let $z$ be any string; then $z \circ \epsilon = \epsilon \circ z = z$.

f. Associativity: $(x \circ y) \circ z = x \circ (y \circ z)$, where $x$, $y$, and $z$ are three strings.


Note-3: 
Let $x$ and $y$ be two strings over an alphabet $\Sigma$. Then $z = x \circ y$ is the
concatenation of $x$ and $y$ with the following properties:
(i) $|z| = |x| + |y|$

(ii) $z(i) = x(i)$ for $1 \le i \le |x|$ and

(iii) $z(|x| + j) = y(j)$ for $1 \le j \le |y|$.
It is emphasised that the leading part of $z$ comes from $x$ and the trailing
part from $y$.
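
The three properties can be checked mechanically with a short Python sketch. The function name `check_concatenation` is hypothetical, and Python indexes strings from 0, whereas the note counts positions from 1.

```python
def check_concatenation(x: str, y: str) -> bool:
    """Verify properties (i) to (iii) of the note for z = x concatenated with y."""
    z = x + y
    length_ok = len(z) == len(x) + len(y)                        # property (i)
    prefix_ok = all(z[i] == x[i] for i in range(len(x)))         # property (ii)
    suffix_ok = all(z[len(x) + j] == y[j] for j in range(len(y)))  # property (iii)
    return length_ok and prefix_ok and suffix_ok


print(check_concatenation("0100101", "1111"))  # True
print(check_concatenation("abba", ""))         # True
```

Concatenating with the empty string leaves the other string unchanged, which is why the second call also succeeds.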


2.2.1.1 Inductive/Recursive Definition for Concatenation of Strings: This kind
of inductive definition is very common in computer science. For repeated concatenation
of a string with itself, it is given by:


Let $x$ be any string over an alphabet $\Sigma$. Then


$$x^0 = \epsilon; \qquad x^{i+1} = x^i \circ x, \quad i \in N$$


where $N = \{0, 1, 2, \ldots\}$ is the set of non-negative integers.


2.2.2 Kleene Closure


The Kleene star or Kleene closure was introduced by Stephen Kleene. It is a unary operation,
either on sets of strings or on sets of symbols or characters. The application of the Kleene
star to a set $S$ is written as $S^*$. It is widely used for regular expressions.


1. If $S$ is a set of strings, then $S^*$ is defined as the smallest superset of $S$ that contains $\epsilon$
and is closed under the string concatenation operation.

2. If $S$ is a set of symbols or characters, then $S^*$ is the set of all strings over the symbols
in $S$, including the empty string.

Thus,


$$S^* = \bigcup_{i \ge 0} S^i = \{\epsilon\} \cup S \cup S^2 \cup S^3 \cup \cdots, \quad i \in N$$


where $S^i$ refers to the set of strings obtained by concatenating $i$ strings of $S$.


EXAMPLE 2.2.2: (Kleene Closure) 


■ Kleene closure applied to a set of strings: 
a. If S = {cc, d}, then 
S* = {ε or any word composed of factors cc and d} 

= {ε or all strings of c’s and d’s in which c occurs in pairs} 

= ∪ S^i, i ∈ N 
where 
S^0 = {ε} 
S^1 = {cc, d} 
S^2 = {cccc, ccd, dcc, dd} 
S^3 = {cccccc, ccccd, ccdcc, ccdd, dcccc, dccd, ddcc, ddd} 
Thus, listing the strings in order of length, 


S* = {ε, d, cc, dd, ccd, dcc, ddd, cccc, ccdd, dccd, ddcc, dddd, ...} 
b. If S = {aa, b}, then S* = {aa, b}*, i.e., 
S* = ∪ S^i, i ∈ N 
where 
S^0 = {ε}, 

S^1 = {aa, b}, 

S^2 = {aaaa, aab, baa, bb}, 

S^3 = {aaaaaa, aaaab, aabaa, aabb, baaaa, baab, bbaa, bbb} 
Thus, listing the strings in order of length, 


S* = {ε, b, aa, bb, aab, baa, bbb, aaaa, aabb, baab, bbaa, bbbb, ...} 


c. Let S = {a, ba}. Then S* = {a, ba}*, i.e., 
S* = ∪ S^i, i ∈ N 


where 

S^0 = {ε}, 

S^1 = {a, ba}, 

S^2 = {aa, aba, baa, baba}, 

S^3 = {aaa, aaba, abaa, ababa, baaa, baaba, babaa, bababa} 
Thus, listing the strings in order of length, 


S* = {ε, a, aa, ba, aaa, aba, baa, aaaa, aaba, abaa, baaa, baba, ...} 
■ Kleene closure applied to a set of characters: Let S = {a, b, c}. Then S* = 
{ε, a, b, c, aa, ab, ac, ba, bb, bc, ...} 
2.2.3 Positive Closure 


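The positive closure is the closure operation restricted to concatenations of one or more 
strings: it collects all words that can be built from at least one word of the given set. Thus, 
if S is a set of strings/words, then S⁺ is called the positive closure of S with 


S⁺ = ∪ S^i, i ≥ 1 

   = S ∪ S^2 ∪ S^3 ∪ ... 


Consequently S* = S⁺ ∪ {ε}, and the null string ε belongs to S⁺ only when ε belongs to S. 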


EXAMPLE 2.2.3: (Positive closure) 
a. If S = {a, ba}, then S⁺ = {a, ba}⁺, i.e., 


S⁺ = ∪ S^i, i = 1, 2, 3, ... 
where 

S^1 = {a, ba}, 

S^2 = {aa, aba, baa, baba}, 

S^3 = {aaa, aaba, abaa, ababa, baaa, baaba, babaa, bababa} 
Thus, listing the strings in order of length, 


S⁺ = {a, aa, ba, aaa, aba, baa, aaaa, aaba, abaa, baaa, baba, ...} 
b. Examples of a string set S satisfying the given conditions 
i. S* = S⁺ 
Let S = {ε, a}. Then S* = {ε, a, aa, aaa, ...} and S⁺ = {ε, a, aa, aaa, ...} 
⇒ S* = S⁺ 
ii. S* ≠ S⁺ 
Let S = {a}. Then S* = {ε, a, aa, aaa, ...} and S⁺ = {a, aa, aaa, ...} 
⇒ S* ≠ S⁺ 
iii. S = S* 
Let S = {ε, a, aa, aaa, ...}. Then S* = {ε, a, aa, aaa, ...} 
⇒ S = S* 
iv. S ≠ S* 
Let S = {a}. Then S* = {ε, a, aa, aaa, ...} 
⇒ S ≠ S* 
c. Example of a set of strings such that S* is finite. 
Let S = {ε}. Then S* = {ε} 
⇒ S* is finite. 
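
Since S* and S⁺ are usually infinite, a program can only list their members up to some length bound. The sketch below, in Python, does exactly that for a finite set S; the function names set_power, star_upto and plus_upto and the bound n are assumptions introduced for illustration, and the empty Python string stands for ε.

```python
def set_power(S: set, i: int) -> set:
    """Return S^i: all concatenations of exactly i strings taken from S."""
    result = {""}
    for _ in range(i):
        result = {u + v for u in result for v in S}
    return result


def star_upto(S: set, n: int) -> set:
    """Return the strings of S* whose length is at most n."""
    found = {""}
    frontier = {""}
    while frontier:
        # extend every newly found string by one more factor from S
        frontier = {u + v for u in frontier for v in S if len(u + v) <= n} - found
        found |= frontier
    return found


def plus_upto(S: set, n: int) -> set:
    """Return the strings of the positive closure of S whose length is at most n."""
    # a member of the positive closure is a member of S* followed by one factor from S
    return {u + v for u in star_upto(S, n) for v in S if len(u + v) <= n}


star_a = star_upto({"a"}, 3)        # contains the empty string
plus_a = plus_upto({"a"}, 3)        # does not contain the empty string
plus_ea = plus_upto({"", "a"}, 3)   # contains it, because S itself does
```

Under these assumptions the two closures agree (up to the bound) exactly when the empty string already belongs to S, which mirrors cases b(i) and b(ii) of the example.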


Stephen Cole Kleene (1909 - 1994) was born in Hartford, Connecticut. He 
graduated from Amherst College and 4 years later received his Ph.D. in mathematics 
from Princeton. 

Kleene is best known for founding the branch of mathematical logic known as 
recursion theory, together with Alonzo Church, Kurt Gödel, Alan Turing and others, 
and for inventing regular expressions. By providing methods of determining which 
problems are solvable, Kleene’s work led to the study of which functions 
are computable. Among other things, Kleene algebra, the Kleene star, Kleene’s recursion 
theorem and the Kleene fixpoint theorem are named after him. He also contributed to 
mathematical intuitionism as founded by Luitzen Egbertus Brouwer. 

Kleene was awarded an honorary Doctor of Science by Amherst College in 1970, 
the Steele Prize by the American Mathematical Society in 1983, and the National 
Medal of Science in 1990. 


2.2.4 String Reversal 


Intuitively, when a string x is reversed, i.e. spelt backwards, the resulting string is 
written x^R. If n = |x|, then x^R satisfies: 


■ x^R(i) = x(n + 1 − i), for 1 ≤ i ≤ n 
■ |x^R| = |x|. 


EXAMPLE 2.2.4: (String reversal) 


a. If x = mus, then x^R = sum. 
b. Let x = bottle. Then x^R = elttob and (x^R)^R = bottle = x. 

c. Let x = klim and y = rettub. Then (x ∘ y)^R = (xy)^R = (klimrettub)^R = buttermilk = 
butter ∘ milk = y^R x^R. 
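
Reversal can also be defined recursively: the reverse of the empty string is the empty string, and the reverse of a symbol followed by a string w is the reverse of w followed by that symbol. The Python sketch below follows this recursion (the function name reverse is an illustrative assumption) and re-checks the identities used in the example.

```python
def reverse(x: str) -> str:
    """Return the reversal of x, computed by recursion on the first symbol."""
    if x == "":
        return ""                      # the empty string is its own reversal
    return reverse(x[1:]) + x[0]       # reverse the tail, then append the head


x, y = "klim", "rettub"
left = reverse(x + y)                  # reversal of the concatenation
right = reverse(y) + reverse(x)        # reversals concatenated in the opposite order
```

In everyday Python the slice x[::-1] gives the same result; the recursive form is shown only because it matches the inductive style of the definitions in this chapter.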


2.2.5 Parts of Strings 


It is important to be able to talk about the various parts of a string, for example a leading 
part, a middle segment and a trailing part. These concepts are used on several occasions and 
are referred to as prefix, substring and suffix respectively. 


EXAMPLE 2.2.5: (Parts of Strings) 


a. If x is a string and x = yz for some z, then y is a prefix of x. On the other hand, if 
there exist strings y and u, such that x = yzu, then z is a substring of x. Further, if 
x = yz for some y, then z is a suffix of x. 

b. Let x = masterpiece. Then master is a prefix of x. This is because, if y = master, 
z = piece, then x = yz, fitting the definition of prefix. By analogous reasoning, it 
can be shown that piece is a suffix of masterpiece. 

c. Let x = blessings. Then in is a substring of x, since we can let y = bless, z = in and 
u = gs in the definition. 

d. Let x = roses. The prefixes of x are ε, r, ro, ros, rose and roses. 

e. Let x, y and z be three strings defined over the alphabet {e, n, r, s, t, w}, such that 
x = western, y = west and z = ern. Then y is a prefix of x and z is a suffix of x, but 
z is neither a prefix, a substring, nor a suffix of y. 

f. Let x = tea. Then, by taking y = ε, we observe that z = tea is a suffix of x. Thus, tea 
is a suffix of itself. 

g. Let x = 101 and y = x^3 = 101101101. It is apparent that y contains the substring x three 
times. Also, x is a prefix of y and x is a suffix of y. Further, y contains two occurrences of 
the substring x^2, viz., y(1) ... y(6) and y(4) ... y(9). This indicates that occurrences 
can overlap. 
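
The notions of prefix, suffix and substring, including overlapping occurrences, can be enumerated with string slices. The following Python sketch uses hypothetical helper names (prefixes, suffixes, substrings, occurrences); positions are reported 1-based, as in the text.

```python
def prefixes(x: str) -> list:
    """All prefixes of x, from the empty string up to x itself."""
    return [x[:i] for i in range(len(x) + 1)]


def suffixes(x: str) -> list:
    """All suffixes of x, from x itself down to the empty string."""
    return [x[i:] for i in range(len(x) + 1)]


def substrings(x: str) -> set:
    """All substrings of x (every prefix and suffix is also a substring)."""
    return {x[i:j] for i in range(len(x) + 1) for j in range(i, len(x) + 1)}


def occurrences(x: str, z: str) -> list:
    """1-based start positions of every (possibly overlapping) occurrence of z in x."""
    return [i + 1 for i in range(len(x) - len(z) + 1) if x.startswith(z, i)]


y = "101" * 3
starts_x = occurrences(y, "101")       # occurrences of x in y
starts_xx = occurrences(y, "101101")   # occurrences of x^2 in y, which overlap
```

A string of length n has n + 1 prefixes and n + 1 suffixes, which the list comprehensions make visible through range(len(x) + 1).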


Theorem I 
For any set S of strings, 
S* = (S*)* = S** 
Proof: Every word in S** is made up of factors from S*, and every factor from S* is, in 
turn, made up of factors from S. Therefore, every word in S** is made up of factors from S. 
Let x ∈ S**. 
Then x = x₁ ··· xₙ for some x₁, ..., xₙ ∈ S*, and each xᵢ is a concatenation of strings of S, 
so x ∈ S*. 


⇒ S** ⊆ S* ...... (1) 


Similarly, since every set of strings is contained in its own closure, it can be shown that 
S* ⊆ S** ...... (2) 


From (1) and (2), it follows that 


S* = S** 


Hence proved. 


2.3 Formal Languages 


The following are equivalent definitions of the concept of a formal language. 


1. A formal language (or simply language) is an abstraction of the general characteristics of 
programming languages. It can be defined as a set of strings, all of which are chosen 
from some particular alphabet Σ. 

2. A formal language is a set of finite-length words (i.e., character strings) drawn from 
some finite alphabet. 
Languages are denoted by the letter L, with or without a subscript. 
Examples (language notation) 


a. L(M) is the language defined by a machine M that accepts a certain set of strings. 
b. L(G) is the language defined by a grammar G that generates a certain set of strings. 
c. L(r) is the language defined by a regular expression r. 


EXAMPLE 2.3.1: (Languages) 


a. Let Σ = {z}. 
Then the language of all possible non-empty strings is given by 


L₁ = {z, zz, zzz, zzzz, ...}, 
which can be written as 
L₁ = {z^n}, for n = 1, 2, 3, ... 


Here L₁ does not contain the null string. 
It is to be noted that the concatenation of z^n and z^m is z^{n+m}. 
b. If L₁ includes the null string, then we can write 


L₁ = {z^n | n = 0, 1, 2, ...}. 
c. The language of all possible strings/words of odd length is 
L₂ = {z, zzz, zzzzz, ...}, 


where Σ = {z}. 
This can also be written as 


L₂ = {z^{2n+1} | n = 0, 1, 2, ...}. 
d. The language of all possible words of even length is 
L₃ = {ε, zz, zzzz, ...} over Σ = {z}. 
This can also be written as 
L₃ = {z^{2n} | n = 0, 1, 2, ...}. 
e. The language of all possible words whose length p is a prime number is 
L₄ = {z^p | p is a prime natural number}. 


f. L₅ = {0, 1, 000, 111, 00000, 11111, ...} is a language in which all the strings consist 
of only 0’s or only 1’s and have an odd length. 
g. The language consisting of only the null string is {ε}. 
h. Null language: {} = ∅. 
i. The language of binary numbers whose value is a prime number is 
L₆ = {10, 11, 101, 111, 1011, ...} 
over the alphabet Σ = {0, 1}. 


Note-4: 
ε belongs to the language of strings of length two or less, since it has length zero, but 
it does not belong to the alphabet {0, 1}. It is emphasised that ε is not in any alphabet. 


2.3.1 Formal Language Specification 
A formal language can be specified in a number of ways, such as 


a. the strings produced by some formal grammar (Chomsky hierarchy) 

b. the strings described by a regular expression 

c. the strings accepted by some automaton, such as a Turing machine or a finite state 
automaton 

d. the instances of a decision problem (a set of related YES/NO questions) for which the 
answer is YES. 


2.3.2 Category of Formal Languages 
The following types of formal languages will be discussed in the later chapters. 


■ Recursive Language 
■ Recursively Enumerable Language 
■ Context-Sensitive Language 
■ Context-Free Language 
■ Regular Language 
■ Indexed Language and 
■ Deterministic Context-Free Language 


2.4 Operations on Languages 


As discussed earlier, languages are just sets, and hence the usual set operations, viz., 
complement, union, intersection and symmetric difference, can be applied directly 
to generate new languages. Further, the fundamental laws of set theory, such as 
De Morgan’s laws, hold good. Other language operations, such as length 
subsetting and reversal, rely on the fact that a language is specifically a set of 
strings. We shall now discuss the different operations with illustrations: 


2.4.1 Union 


This is one of the simplest operations on two languages. As discussed earlier, languages 
are sets of strings, and hence the union of two languages L₁ and L₂ is the set L₁ ∪ L₂. 
Mathematically, this is represented as 


x ∈ L₁ ∪ L₂ if and only if x ∈ L₁ or x ∈ L₂, 
where L₁ and L₂ are defined over the same alphabet. 
Note-5: 
Suppose L₁ is defined on the alphabet Γ₁ and L₂ on Γ₂. Then 
we can consider L₁ and L₂ to be defined on a new alphabet 
Γ = Γ₁ ∪ Γ₂ and proceed. 


EXAMPLE 2.4.1: (Union) 


a. Let L₁ = {0, 01, 011} and L₂ = {ε, 001, 1^5}. 
Then 


L₁ ∪ L₂ = {ε, 0, 01, 001, 011, 1^5} 


b. Let L₁ = {0, 000, 00000, ...} and L₂ = {ε, 00, 0000, ...}. 


Then 
L₁ ∪ L₂ = {0}* 
c. Let Lᵢ = {z^i}. 
Then 
∪_{i=0}^{7} Lᵢ = L₀ ∪ L₁ ∪ L₂ ∪ L₃ ∪ L₄ ∪ ... ∪ L₇ 
= {ε, z, zz, ..., z^7} 
Also 


∪_{i=0}^{∞} Lᵢ = {ε, z, zz, ...} = {z}* 
2.4.2 Intersection 


Suppose L₁ and L₂ are two languages over an alphabet Γ. Then the intersection of L₁ and L₂ 
is denoted by L₁ ∩ L₂ and, clearly, this is a set of strings. Mathematically, this is represented 
as 


x ∈ L₁ ∩ L₂ if and only if x ∈ L₁ and x ∈ L₂, 


where L₁ and L₂ are defined over Γ. 
EXAMPLE 2.4.2: (Intersection) 


a. Let L₁ = {ε, 00, 0000, ...} and L₂ = {ε, 0^3, 0^6, ...}. 
Then 
L₁ ∩ L₂ = {x | #₀(x) mod 6 = 0 and x ∈ {0}*} 


b. Let L₁ = {ε} and L₂ = ∅. Then L₁ ∩ L₂ = ∅. 
c. Let L be any language. Then L ∩ L = L. 


2.4.3. Complementation 


Complementation is an operation performed on a single language. Suppose L is a language 
over an alphabet Σ. Then the complement of L, denoted by L̄ or L′, is the language consisting 
of all those strings over Σ that are not in L. 

Mathematically, this is expressed as 


x ∈ L′ if and only if x ∈ Σ* − L. 


EXAMPLE 2.4.3: (Complementation) 
a. If Σ = {a, b} and L = {a, b, aa}, then the complement of L is 
L′ = Σ* − L = {ε, a, b, aa, bb, ab, ba, aaa, bbb, ...} − {a, b, aa} 
= {ε, bb, ab, ba, aaa, bbb, ...} 
b. Let L = {ε, 1, 11, ...} be a language over {0, 1}. Then, clearly, L′ consists of all 
strings containing at least one 0. 
c. Let L = {0, 00, 0000, ...} be a language over {0, 1}. Then 
L′ = {ε, 0, 1, 00, 01, 10, 11, ...} − L 


d. Let L = {0^2, 0^3, 0^5} be a language over {0, 1}. Then, it is interesting to note that 
L′ contains different types of strings, of which one type consists of all strings of 0s 
representing prime numbers greater than five. Also, any string containing a 1 belongs 
to L′. 
Mathematically, we can write 


L′ = {0, 1}* − {0^2, 0^3, 0^5} 
e. Let L = {x | #₀(x) = #₁(x)}. Then 
L′ = {x | #₀(x) ≠ #₁(x)} 


In other words, L′ consists of strings in which the numbers of occurrences of 0 and 1 
are not equal. 
f. Let L be any language over Σ. Then Σ* = L ∪ L′. 


2.4.4 Symmetric Difference 


Symmetric difference is an important but less familiar operation. 
Suppose L₁ and L₂ are two languages defined over an alphabet Σ. Then the symmetric 
difference of L₁ and L₂ is denoted by L₁ ⊕ L₂, and is defined as 


L₁ ⊕ L₂ = (L₁ ∪ L₂) − (L₁ ∩ L₂). 


Accordingly, the elements of L₁ ⊕ L₂ are contained either in L₁ or in L₂, but not in both. 


EXAMPLE 2.4.4: (Symmetric difference) 


a. Let L be a language over an alphabet Σ, and let L′ be its complement. Then clearly 
L ⊕ ∅ = L, L ⊕ L = ∅, L ⊕ Σ* = L′ and L ⊕ L′ = Σ*. 
b. Let L₁ = {00, 0000, ...} and L₂ = {11, 1111, ...}. Then, 


L₁ ⊕ L₂ = {00, 11, 0000, 1111, ...} = L₁ ∪ L₂. 
c. If Σ = {0, 1}, then Σ^{≤2} ⊕ {ε, 0, 00, 101} is the language {1, 01, 10, 11, 101}, since 
Σ^{≤2} = {ε, 0, 1, 00, 01, 10, 11}. 
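
For finite languages, the set operations of sections 2.4.1 to 2.4.4 correspond directly to Python's built-in set operators. The sketch below is only an illustration: the helper names sigma_upto and complement_upto are assumptions, and because a complement is normally infinite, it is computed only relative to the strings of length at most n.

```python
from itertools import product


def sigma_upto(sigma: set, n: int) -> set:
    """All strings over the alphabet sigma of length at most n."""
    return {"".join(p) for k in range(n + 1) for p in product(sorted(sigma), repeat=k)}


def complement_upto(L: set, sigma: set, n: int) -> set:
    """The part of the complement of L that consists of strings of length at most n."""
    return sigma_upto(sigma, n) - L


binary = {"0", "1"}
A = {"", "0", "00", "101"}
short = sigma_upto(binary, 2)

union = short | A            # union
inter = short & A            # intersection
sym_diff = short ^ A         # symmetric difference
same = sym_diff == (union - inter)

comp = complement_upto({"a", "b", "aa"}, {"a", "b"}, 2)
```

The variable sym_diff reproduces part (c) of the symmetric difference example, and comp lists the members of the complement from the complementation example that have length two or less.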
2.4.5 Concatenation of languages 


Concatenation of two languages L₁ and L₂ is the language L₁ ∘ L₂, each element of which 
is a string formed by gluing one string of L₁ to another string of L₂. Mathematically, this 
is expressed as: 


L₁ ∘ L₂ = L₁L₂ = {x ∘ y | x ∈ L₁ and y ∈ L₂}. 


This is illustrated through the following examples: 
EXAMPLE 2.4.5: (Concatenation) 


a. Let L₁ = {bc, bcc, cc} and L₂ = {cc, ccc}. 
Then, 
L₁ ∘ L₂ = {bccc, bcccc, bccccc, cccc, ccccc}. 


But the Cartesian product is given by 
L₁ × L₂ = {(bc, cc), (bc, ccc), (bcc, cc), (bcc, ccc), (cc, cc), (cc, ccc)}. 


It is observed that L₁ ∘ L₂ ≠ L₁ × L₂. Also |L₁ × L₂| = 6 and |L₁ ∘ L₂| = 5. This is 
because bc ∘ ccc = bcc ∘ cc. Indeed, strings and ordered pairs are different structures. 
b. Let L₁ = {0, 1}* and L₂ = {0, 1}. Then, 


L₁ ∘ L₂ = L₁L₂ = {0, 1}* − {ε}. 
c. Let L₁ = {z, zzz, ...} and L₂ = {zz, zzzz, ...}. Then, 
L₁ ∘ L₂ = {zzz, zzzzz, ...}. 


d. For any language L, L ∘ ∅ = ∅ ∘ L = ∅. 
e. Let L₁ = {0}* and L₂ = {1}*. Then L₁ ∘ L₂ = {x | x consists of zero or more 0s, 
followed by zero or more 1s}. 


The notation L^i is analogous to the notation x^i and represents the i-fold concatenation of 
L with itself. We give the following recursive definition: 


L^0 = {ε} ; L^{i+1} = L^i ∘ L (see 2.2.1). 
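
The difference between concatenation and the Cartesian product, and the recursive definition of powers, can be reproduced with a few lines of Python. The names concat_lang and lang_power are illustrative assumptions, and the languages are those of part (a) of the example.

```python
def concat_lang(L1: set, L2: set) -> set:
    """Concatenation of two languages: every string of L1 followed by every string of L2."""
    return {x + y for x in L1 for y in L2}


def lang_power(L: set, i: int) -> set:
    """L^i, built from L^0 (the language holding only the empty string) by repeated concatenation."""
    result = {""}
    for _ in range(i):
        result = concat_lang(result, L)
    return result


L1 = {"bc", "bcc", "cc"}
L2 = {"cc", "ccc"}

pairs = {(x, y) for x in L1 for y in L2}   # Cartesian product: ordered pairs
words = concat_lang(L1, L2)                # concatenation: strings
```

Because a set stores each string once, the two pairs (bc, ccc) and (bcc, cc) collapse into the single string bcccc, which is why words has one element fewer than pairs.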


2.4.6 Reversal of Languages 


The reversal of a language is an important operation that can be used in illustrating the 
working procedure of various machines. This operation is similar to the reversal of a 
string. Mathematically, this can be expressed as: 


L^R = {w^R | w ∈ L}. 


In other words, each element of L^R is obtained by reversing the corresponding string in L. 
This is illustrated through the following examples: 


EXAMPLE 2.4.6: (Reversal of languages) 
a. If L = {a₁b₁, a₂b₂, a₃b₃}, then 


L^R = {b₁a₁, b₂a₂, b₃a₃}. 
b. If L = {0, 011, 0111}, then 
L^R = {0, 110, 1110}. 
c. If L = {0^i 1^{i+1} | i ≥ 1}, then 
L^R = {1^{i+1} 0^i | i ≥ 1}. 
d. If L = {0, 1}*, then L^R = {0, 1}*. 
e. If L = {123, 321, 213, ...} over Σ = {1, 2, 3}, then L^R = {321, 123, 312, ...}. 
2.4.7 Palindrome Languages 
A language called ‘Palindrome’ over Σ = {a, b} is defined as 
Palindrome = {ε and all strings w such that w^R = w and w ∈ Σ*}. 
Equivalently, 


Palindrome = {ε, a, b, aa, bb, aaa, aba, bab, bbb, aaaa, abba, ...}. 
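
Both the reversal of a finite language and the first few palindromes can be generated programmatically. The Python sketch below is illustrative only; the function names reverse_lang, is_palindrome and palindromes_upto, and the length bound n, are assumptions.

```python
from itertools import product


def reverse_lang(L: set) -> set:
    """Reversal of a language: the set of reversals of its strings."""
    return {w[::-1] for w in L}


def is_palindrome(w: str) -> bool:
    """A string is a palindrome when it equals its own reversal."""
    return w == w[::-1]


def palindromes_upto(sigma: str, n: int) -> list:
    """Palindromes over sigma of length at most n, ordered by length and then alphabetically."""
    words = ("".join(p) for k in range(n + 1) for p in product(sorted(sigma), repeat=k))
    return [w for w in words if is_palindrome(w)]


reversed_L = reverse_lang({"0", "011", "0111"})
first_palindromes = palindromes_upto("ab", 3)
```

The list first_palindromes begins with the empty string and continues in the same order as the enumeration of Palindrome given above.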


2.4.8 Length Subsetting of a language 


Suppose L is a language over a fixed alphabet Σ. Then, for the sake of convenience or 
compactness, it may be required to specify the strings in L of length less than or equal to a 
specific value, or of a fixed size. In that case, L^{≤n} denotes all strings in L of length less 
than or equal to n, and L^{=n} denotes all strings in L of length exactly n. For a fixed 
alphabet Σ, the notation Σ^{≤n} is an abbreviation for (Σ*)^{≤n}. 


EXAMPLE 2.4.7: (Length subsetting) 
a. Let L = {1, 11, 111, 1111, ...}. Then 


L^{=4} = {1111} and L^{≤4} = {1, 11, 111, 1111}. 
b. Let Ly = {€, 1010, 0°} and Ly = {0, 1}<’. Then, 
Ly NL2 = {e, 1010}. 


c. Let L = {0, 1}. Then L^{=0} = ∅, L^{=1} = L and L^{≤1} = L. Further, L^{=i} = ∅ and 
L^{≤i} = L for i ∈ N, i ≥ 2. 
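
Length subsetting amounts to filtering a language by string length. In the Python sketch below, the helper names exactly and at_most are assumptions, and the infinite language of part (a) is replaced by a finite stand-in that contains its strings up to length 9.

```python
def exactly(L: set, n: int) -> set:
    """The strings of L whose length is exactly n."""
    return {w for w in L if len(w) == n}


def at_most(L: set, n: int) -> set:
    """The strings of L whose length is at most n."""
    return {w for w in L if len(w) <= n}


# Finite stand-in for {1, 11, 111, ...}: only the strings up to length 9 are kept.
ones = {"1" * k for k in range(1, 10)}
four = exactly(ones, 4)
up_to_four = at_most(ones, 4)

bits = {"0", "1"}
no_empty = exactly(bits, 0)   # empty, since the empty string is not in {0, 1}
everything = at_most(bits, 5) # the whole language, since every string has length 1
```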
2.4.9 Kleene Star/Closure of languages 


The Kleene star of a language L is the language L*, consisting of all the strings produced 
by concatenating any finite number of strings (including zero) from L. In other 
words, the Kleene star denotes the set of all strings that can be split into pieces, each of 
which belongs to L; the empty string corresponds to zero pieces. 

Mathematically, this is expressed as 


L* = {x₁x₂ ··· x_k | k ≥ 0 and x₁, x₂, ..., x_k ∈ L}. 


Clearly, L* contains the empty string ε, and L* ⊆ Σ*, where L is defined over Σ. 
Equivalently, the Kleene star of a language L can be considered as the infinite union of 
the powers of L, i.e., 


L* = L^0 ∪ L^1 ∪ L^2 ∪ ... = ∪_{i≥0} L^i. 


The following examples illustrate the above points. 


EXAMPLE 2.4.8: (Kleene Star) 


a. If L = {z} over Σ = {z}, then 
L* = {ε, z, zz, zzz, ...}, 


where L* = L^0 ∪ L^1 ∪ L^2 ∪ ... with L^0 = {ε}, L^1 = {z}, L^2 = {zz}, ... 
b. If L = {0, 1} over Σ = {0, 1}, then 
L^0 = {ε}, 
L^1 = {0, 1}, 
L^2 = {00, 01, 10, 11}, 
L^3 = {000, 001, 010, 011, 100, 101, 110, 111} 
L* = L^0 ∪ L^1 ∪ L^2 ∪ ... 
⇒ L* = {ε, 0, 1, 00, 01, 10, ...}. 


c. {0000}* = {ε, 0^4, 0^8, ...}. 

d. ∅* = {ε}. 

e. If L = {ε}, then L* = {ε}* = {ε}. 

f. Consider the language S*, where S = {ab, ba} with Σ = {a, b}. 


■ All words in S* that have seven or fewer letters are: 
ε, ab, ba, abab, abba, baab, baba, ababab, ababba, abbaab, abbaba, 
baabab, baabba, babaab, bababa. 

■ The shortest strings in Σ* that are not in the language S* are a and b. 
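
The claims about S = {ab, ba} can be verified by generating S* up to a length bound. The Python sketch below is an illustration under that restriction; star_upto and the other names are assumptions, not part of the text.

```python
from itertools import product


def star_upto(L: set, n: int) -> set:
    """The strings of the Kleene star of L whose length is at most n."""
    found = {""}
    frontier = {""}
    while frontier:
        # append one more string of L to each string discovered in the last round
        frontier = {u + v for u in frontier for v in L if len(u + v) <= n} - found
        found |= frontier
    return found


S = {"ab", "ba"}
star = star_upto(S, 7)
words = sorted(star, key=lambda w: (len(w), w))   # shortest first, then alphabetical

# Strings over {a, b} of length at most 7 that are missing from the star of S.
all_words = ["".join(p) for k in range(8) for p in product("ab", repeat=k)]
missing = [w for w in all_words if w not in star]
shortest = min(len(w) for w in missing)
shortest_missing = [w for w in missing if len(w) == shortest]
```

Every word of S* has even length, so there are 1 + 2 + 4 + 8 = 15 words of seven or fewer letters, and the shortest strings left out are those of length one.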


2.4.10 De Morgan’s Laws 


De Morgan’s laws allow one to express the intersection of two languages over Σ in terms 
of the two operations union and complementation. These are fundamental laws of 
set theory. 


Let L₁, L₂ and L₃ be any three languages. Then 


L₁ − (L₂ ∪ L₃) = (L₁ − L₂) ∩ (L₁ − L₃) ...... (1) 
L₁ − (L₂ ∩ L₃) = (L₁ − L₂) ∪ (L₁ − L₃) ...... (2) 
L₁ ∩ L₂ = (L₁′ ∪ L₂′)′ ...... (3) 


An equivalent form of (3) is 
L₁ ∩ L₂ = Σ* − ((Σ* − L₁) ∪ (Σ* − L₂)). 
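
De Morgan's laws can be spot-checked on small languages. In the Python sketch below, the finite set U of all binary strings of length at most 3 stands in for the full set of strings over the alphabet, and the three sample languages are arbitrary choices made for illustration; a check of this kind illustrates the laws but does not prove them.

```python
from itertools import product

# Finite stand-in for the universe: all strings over {0, 1} of length at most 3.
U = {"".join(p) for k in range(4) for p in product("01", repeat=k)}

# Three sample languages, restricted to U (chosen arbitrarily).
L1 = {w for w in U if w.count("0") % 2 == 0}   # even number of 0s
L2 = {w for w in U if w.endswith("1")}         # ends with 1
L3 = {w for w in U if len(w) <= 1}             # length at most 1

law1 = L1 - (L2 | L3) == (L1 - L2) & (L1 - L3)
law2 = L1 - (L2 & L3) == (L1 - L2) | (L1 - L3)
# intersection expressed through union and complement relative to U
law3 = L1 & L2 == U - ((U - L1) | (U - L2))
```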
2.5.1 Symbols, Alphabets and Strings 


1. Are the alphabets {c, d} and {C, B} the same? 

2. Give two sets that are not alphabets. 

3. Is the empty set ∅ an alphabet? 

4. How many occurrences of e are there in excellence? What are their positions? 

5. What is #;(x), if x = singing sisters? 

6. Let x = cdc. Write down x* and find its length. 

7. Find a string x that can simultaneously be a prefix, a substring and a suffix of a 
string. 

8. Let x = ylf and z = rettub. Then show that (xz)^R = z^R x^R. 

9. Let x = Coffee. Show that Coffee is a suffix of itself. 

10. Let Σ = {$, %, #}. Find Σ*. 

11. Let Σ be an alphabet. How many prefixes does the string x = x₁x₂x₃ ··· xₙ have? 
(xᵢ ∈ Σ for 1 ≤ i ≤ n) 

12. Prove the statement ‘the number of suffixes of a string is always equal to the number 
of prefixes.’ 

13. Give the recursive definition of string reversal. 

14. If x is a string over an alphabet Σ, prove that (x^R)^R = x. 

2.5.2 Languages 


1. Check whether the following are languages: 
a. ε b. ∅ c. an alphabet d. {0, 1} 
2. Specify the number of strings in the following, when Σ = {0, 1}: 
ao brs? ces a Ds. 
3. Specify the number of strings in the following languages, where Σ = {1, 2, 3}: 
a D299 bo «rs 
4. Let L₁ = {x | #₀(x) = #₁(x)} and L₂ = {0^i 1^i | i ≥ 0}. Then show that L₁ ∩ L₂ = L₂. 
5. Let L₁ = {x | #₀(x) = #₁(x)} and L₂ = {0^i 1^i | i ≥ 1}. Show that L₁ ⊕ L₂ = L₁ − L₂. 
6. Let L = {0, 1}*. Compute L=?, L=* and |L^{=i}| for i ∈ N. 
7. Let L = {x | x ∈ {0, 1}* and x has 001 as a prefix}. What are L=°, L^{=3} and L^{≤4}? 
8. What is {ε, 01, 011} ∪ {ε, 01, 10, 001}? 
9. For any languages L₁, L₂ and L₃, prove that (L₁ ∪ L₂) ∪ L₃ = L₁ ∪ (L₂ ∪ L₃). What 
is this property called? 
10. Is there a language L such that L ∘ L = L? 
11. Specify an infinite language L such that L = L^R. 
12. Let Γ be any alphabet and L a language over Γ. Then prove that (L′)′ = L. 
13. If L = {€,00,01, 10, 11,07,--- ,111---}, find Z over the alphabet {0, 1}. 
14. Find L₁ ⊕ L₂, where L₁ = {bc, cc, ccc} and L₂ = {c, bc, cc}. 
15. For any two languages L₁ and L₂, prove the following: 

a. L₁ ∩ L₂ = L₂ ∩ L₁, L₁ ∩ L₁ = L₁, L₁ ∩ ∅ = ∅ 

b. L₁ ⊕ L₂ = (L₁ ∪ L₂) − (L₁ ∩ L₂) 
16. What is the concatenation of {ε, 10, 101} and {0, 001, 11}? 
17. Does the Kleene star of a language always result in a ‘bigger’ language? 
18. Is L⁺ = L*? Explain. 
19. Consider a language L*, where L = {ab, cd} with Σ = {a, b}. Then answer the 
following: 

a. Write all the words in L* that have seven or fewer letters/symbols. 

b. What is the shortest string in Σ* that is not in the language L*? 
20. Give an example of a set S of strings, such that: 

a. S* = S⁺ b. S* ≠ S⁺ c. S = S* d. S ≠ S* e. S* is finite 
Regular Expressions and Languages 
‘Every Regular expression is associated with some language.’ 


Introduction 


The power of computers is estimated with regard to their (a) speed, (b) storage capacity and 
(c) amplification capability of codes. Suppose we have a directory called public_html 
containing a number of personal World Wide Web files such as gif, midi and html files, without 
any subdirectories. Then the reorganisation of the directory, by creating subdirectories for each 
type of file, could be done by using a pattern-matching mechanism. In other words, a pattern 
is used to extract (or filter) a subset of files from a larger collection of files. For example, 
*.midi might generate the following subset of files: 

Karn.midi, Chen.midi, Andh.midi, ..... 

In other words, one command works whether there are 3 or 3000 .midi files. Our goal is 
to develop a way of specifying a filter, that extracts out a desired subset from a larger set by 
using the language operations discussed in chapter 2. 

We shall show that the basic set of operations on languages provides us with a 
mathematical way of specifying a filter. 

Suppose Σ represents the alphabet of all possible characters that appear in file names. 
Then, obviously, all the file names that begin with 1 can be given by the expression: 


{1} ∘ Σ*. 
Accordingly, all the files that end with .midi can be given by the expression: 
Σ* ∘ {.midi}. 
Then, clearly, 
L_filter = {1} ∘ Σ* ∘ {.midi} 


represents the set of strings that start with 1 and end in .midi. 
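
As a rough illustration of such a filter, the Python sketch below applies the same idea to a small list of file names. The file names are hypothetical, the regular expression is one possible rendering of the filter (a leading 1, then any characters, then the ending .midi), and fnmatchcase from the standard fnmatch module is used for the shell-style pattern *.midi.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical file names, used only for illustration.
names = ["1Karn.midi", "Chen.midi", "1Andh.gif", "1.midi", "index.html"]

# A leading "1", any sequence of characters, then the literal ending ".midi".
pattern = re.compile(r"1.*\.midi", re.DOTALL)
l_filter = [n for n in names if pattern.fullmatch(n)]

# The shell-style pattern *.midi keeps every name that ends in .midi.
midi_files = [n for n in names if fnmatchcase(n, "*.midi")]
```

The same one-line pattern works whether the list holds three names or three thousand, which is the point made about the single command above.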


3.1 Regular Languages 


Regular languages are those that can be constructed from the simple set operations 
of union, concatenation and Kleene star. Every regular language can therefore be written as an 
expression (or formula) that uses only these operations. 
3.1.1 Definition 


Suppose Σ is an alphabet. Then, the class of regular languages over Σ is inductively defined 
in the following manner: 


(i) ∅ is a regular language. 

(ii) For each a ∈ Σ, {a} is a regular language. 

(iii) If L₁, L₂, ..., Lₙ are regular languages (where n is any natural number), then so is 
L₁ ∪ L₂ ∪ ... ∪ Lₙ. 

(iv) If L₁, L₂, ..., Lₙ are regular languages (where n is any natural number), then so is 
L₁ ∘ L₂ ∘ ... ∘ Lₙ. 

(v) If L is a regular language, then so is L*. 

(vi) Nothing else is a regular language, unless it is constructed using points (i) to (v). 


Each of the above formulae is called a regular construction, and regular languages are those 
that are generated by regular constructions. While writing the above formulae in a compact 
form, the following conventions are adopted: 


a. the shorthand for {a} is a 
b. the shorthand for x ∘ y is xy 
c. parentheses are used only when it is necessary to override the normal precedence of 
* over ∘ and of ∘ over ∪. 


EXAMPLE 3.1.1: (Regular languages) 
a. Let Γ = {a₁, ..., a_k} be an alphabet. Then {aᵢ} is a regular language (by rule (ii)) 
for 1 ≤ i ≤ k. By applying rule (iii) to these singleton sets, we see that Γ is a regular 
language. 

b. Let L = {b, bc} be a language over Σ = {b, c}. Then, by rule (ii), both {b} and {c} 
are regular languages and, by rule (iv), {b} ∘ {c} = {bc} is a regular language. Again, 
by using rule (iii), {b} ∪ {bc} = L is a regular language. 

c. ∅ is a regular language (by rule (i)). Then, by applying rule (v), we observe that 
∅* = {ε} is also a regular language. 

d. Let L = {ε, 0^2, 0^4, 0^6, ...}. Then, by rules (ii) and (iv), {00} is a regular language and 
so is {00}* (by rule (v)). Further, L is a regular language, since L equals {00}*. 

e. The language {ε, 0, 1, 00, ...} ∪ {ε, 1, 11, ...} equals (0 ∪ 1)*, and can also be constructed 
by (0 ∪ 1)* ∪ 1*. 

f. The language {ε, b, bb, ...} is constructed by b*. 
g. Observe that the language over the alphabet Σ = {0, 1} that consists of all strings 
having the pattern 01 as a substring can be constructed easily by 


(0 ∪ 1)*01(0 ∪ 1)*. 


h. Observe that the language over the alphabet Σ = {0, 1} whose strings contain an 
even number of 0s can be constructed by 


(1*((01*)(01*))*), 


or simply 1*(01*01*)*, where ε is included in this language. 
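
A regular construction can be tested against its English description by brute force over all short strings. The sketch below uses Python's re module, in which union is written with | rather than ∪; the helper names all_strings and matching and the length bound of 8 are assumptions made for illustration.

```python
import re
from itertools import product


def all_strings(sigma: str, n: int) -> list:
    """All strings over sigma of length at most n."""
    return ["".join(p) for k in range(n + 1) for p in product(sigma, repeat=k)]


def matching(regex: str, sigma: str, n: int) -> set:
    """Strings of length at most n that are matched in full by regex."""
    pat = re.compile(regex)
    return {w for w in all_strings(sigma, n) if pat.fullmatch(w)}


universe = all_strings("01", 8)

# Part (h): strings with an even number of 0s.
even_zeros = matching(r"1*(01*01*)*", "01", 8)
even_zeros_expected = {w for w in universe if w.count("0") % 2 == 0}

# Part (g): strings having 01 as a substring.
has_01 = matching(r"(0|1)*01(0|1)*", "01", 8)
has_01_expected = {w for w in universe if "01" in w}
```

Agreement on all strings up to length 8 is evidence, not a proof, that each expression describes the stated language.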


3.1.2 Standard Representations of Regular Languages: 


■ Regular Expressions 
■ DFAs 
■ NFAs 
■ Regular Grammars 

Note-1: 
We discuss the properties of regular languages in chapter 7. We then prove, in chapter 12, 
that certain languages are not regular by using the pumping lemma, and finally discuss 
regular languages and their relationship with formal grammars in chapter 15. 


3.2 Regular Expressions 


In this section, an easy-to-use notation for describing the construction of sets of strings, 
(using the basic language operations) is discussed together with its grammar (or syntax) and 
interpretation (semantics). 


3.2.1 Definition (Regular expression) 


Certain sets of strings or languages can be represented in an algebraic fashion, and these 
algebraic expressions of languages are called regular expressions (R.E). 

A regular expression is a string that describes a whole set of strings according to 
certain syntax rules. These expressions are used by text editors and utilities (in particular, 
on UNIX systems) to search bodies of text for certain patterns. 
3.2.2 Standard Regular expression 


Suppose Σ is an alphabet. Then, the standard regular expressions over the alphabet Σ are 
defined (inductively) as follows: 


a. The string () is a regular expression. 
b. For each σ ∈ Σ, the string symbol σ is a regular expression. 
c. If R is a regular expression, then (R) and R* are regular expressions. 
d. If R₁ and R₂ are regular expressions, then (R₁|R₂) and (R₁R₂) are regular expressions. 
e. Regular expressions are formed only through rules (a) through (d). 


Mathematically, each regular expression is a string over the alphabet Σ ∪ {(, ), |, *}. From 
the foregoing discussion, it is observed that each regular expression is merely a string of 
symbols assembled using the syntactic rules (a) to (e) of the definition. A grammatical variable 
indicates a point where the definition is expanded in a recursive manner. 

The meaning of a regular expression is the particular regular construction that it describes. 
For example, (a|c)* describes the construction ({a} ∪ {c})*. But in a broader sense, the 
meaning of a regular expression is the language it constructs. 


3.3. Components of Regular Expressions 


The following notation is used in regular expressions: ‘+’, ‘∘’ and ‘*’ represent union, 
concatenation and Kleene closure respectively. A formal definition of regular expressions 
can be given as: 


a. ∅, ε and each a belonging to Σ are regular expressions; these are called the primitive 
regular expressions. 

b. If R₁ and R₂ are regular expressions, then R₁ + R₂, R₁ ∘ R₂, R₁* and (R₁) are all regular 
expressions. 

c. A string is a regular expression if and only if it can be derived from the primitive regular 
expressions by a finite number of applications of rule (b). Thus, by the repeated 
application of the rules, regular expressions are constructed. 


EXAMPLE 3.3.1: (Regular expressions) 


a. If Σ = {a, b, c} is some alphabet, a regular expression can be built from these 
symbols as 
(a + bc)* ∘ (c + ∅). 


b. If R₁ = c and R₂ = ∅, then c + ∅, i.e. R₁ + R₂, is also a regular expression. 
c. (a + b +) is not a regular expression, since there is no way in which it 
can be constructed from the primitive regular expressions. 

d. If Σ = {0, 1}, then the following are regular expressions that can be built from 
it: 


■ 01 means a zero followed by a one (concatenation of 0 and 1). 
■ 0 + 1 means either a zero or a one (union). 
■ 0* means ε + 0 + 00 + 000 + ... (Kleene star). 
■ 1⁺ means 1 + 11 + 111 + ... (positive closure). 


e. If Σ = {0, 1}, then, using parentheses, the following regular expressions could be 
constructed: 


■ (0 + 1)* means the set of all strings over 0 and 1. 
■ 0*10*10* means strings containing exactly two ones. 
■ (0 + 1)*11 means strings which end with two ones. 

EXAMPLE 3.3.2: (Describing regular expressions in English) 


Regular Expression | Meaning 
0* | Set of strings of zeros of any length, including ε 
0⁺ | Set of strings of zeros of any length, excluding ε 
(a + b)* | Set of strings of a’s and b’s of any length, including ε 
(a + b)⁺ | Set of strings of a’s and b’s of any length, excluding ε 
(a + b)*abb | Set of strings of a’s and b’s ending with abb 
ab(a + b)* | Set of strings of a’s and b’s starting with ab 
0*1*2* | Set of strings of any number of zeros, followed by any number of 1’s, followed by any number of 2’s 
0⁺1⁺2⁺ | Set of strings of one or more zeros, followed by one or more 1’s, followed by one or more 2’s 
00*11*22* | Set of strings of 0’s, 1’s and 2’s with at least one zero, followed by at least one 1, followed by at least one 2 
(a + b)*aa(a + b)* | Set of strings of a’s and b’s of any length having a substring aa 
a*b(a*ba*b)*a* | Set of strings over the alphabet {a, b} that contain an odd number of b’s: there is one b, all further b’s appear in pairs, and any number of a’s can appear in any place in the string 
((a + b)^3)*(ε + a + b) | (a + b)^3 represents the strings of length 3, so ((a + b)^3)* represents the strings whose length is a multiple of 3. Since ((a + b)^3)*(a + b) represents the strings of length 3n + 1, where n is a natural number, the given regular expression represents the strings of length 3n or 3n + 1 
(b + ab)*(a + ab)* | (b + ab)* represents strings which do not contain any substring aa and which, unless empty, end in b. (a + ab)* represents strings which do not contain any substring bb and which, unless empty, start with a. Hence, altogether it represents any string consisting of a part of the first kind followed by a part of the second kind 


EXAMPLE 3.3.3: (Forming a regular expression) 
Construct a regular expression (RE) for the following: 


a. RE to generate strings containing an even number of 0s 


Solution: 
RE = (00)*. 
b. RE that generates an odd number of 1s 
Solution: 


Recall that (11)* is the RE which generates an even number of 1s. Now concatenate 
an even number of 1’s with a single 1 to generate an odd number of 1’s, i.e., 


RE = (11)*1 


c. RE to generate strings of 0’s and 1’s, in any order, that start with 00 
Solution: 
RE = 00(0 + 1)*. 
d. RE to generate strings of 0’s and 1’s, in any order, that end with 011 


Solution: 
RE = (0 + 1)*011 


e. RE to generate a string with an even number of a’s followed by an odd number of b’s 
Solution: 
RE = (aa)*(bb)*b. 
f. RE to generate a string with any number of 0’s and 1’s and having at least one pair 
of consecutive zeros 
Solution: 
RE = (0 + 1)*00(0 + 1)*. 


g. RE to generate a string containing a substring aba 
Solution: 
RE = (a + b)*aba(a + b)*. 


h. RE to generate a string of symbols having an even number of characters 
Solution: 
RE = (aa + ab + ba + bb)* or RE = ((a + b)(a + b))*. 


i. RE to generate a string containing an odd number of a’s followed by an odd number of b’s 
Solution: 
RE = a(aa)*(bb)*b 


EXAMPLE 3.3.4: (Forming a regular expression) 

a. Set of all words of the form: one ‘a’ followed by some number of b’s (possibly zero) 
RE => R = ab* 

b. Set of all words of the form: some positive number of a’s followed by exactly one b 
RE => R = aa*b 

c. Set of all strings of a’s and b’s that have at least two symbols, begin and end with one 
a and have nothing but b’s inside 
RE => R = ab*a 

d. Set of all words over Σ = {a, b} 
RE => R = (a + b)* or (a*b*)* or (ε + a + b)*; if the empty word is excluded, (a + b)⁺ 

e. Set of even number of x’s (possibly zero) 
RE => R = (xx)* 

f. Set of all positive even number of x’s 
RE => R = (xx)⁺ 

g. Set of all odd numbers of x’s 
RE => R = (xx)*x 

h. Set of all three-lettered words, starting with b, over Σ = {a, b} 
RE => R = baa + bab + bba + bbb 

i. Set of all words starting with a and ending with b 
RE => R = a(a + b)*b 

j. Set of all words starting and ending with b 
RE => R = b + b(a + b)*b 

k. Set of all words with exactly two b’s 
RE => R = a*ba*ba* 

l. Set of all words with at least two b’s 
RE => R = (a + b)*b(a + b)*b(a + b)* 

m. Set of all words with at least one a and at least one b 
RE => R = (a + b)*a(a + b)*b(a + b)* + bb*aa* 

n. Set of all words consisting of either all a’s, or a single b followed by a non-negative 
number of a’s 
RE => R = a* + ba* 

o. Set of all words with no two consecutive a’s 
RE => R = (b + ab)*(a + ε) 

p. All strings with at least two consecutive zeros over Σ = {0, 1} 
RE => R = (0 + 1)*00(0 + 1)* 

q. All strings that end in double letters over Σ = {a, b} 
RE => R = (a + b)*(aa + bb) 

r. All strings that do not end in a double letter, over Σ = {a, b} 
RE => R = ε + a + b + (a + b)*(ab + ba) 

s. All strings that have exactly one double letter in them, over Σ = {a, b} 
RE => R = (ε + b)(ab)*aa(ba)*(ε + b) + (ε + a)(ba)*bb(ab)*(ε + a) 

t. All strings that do not end with ab, over Σ = {a, b} 
RE => R = ε + b + (a + b)*(a + bb) 

u. All strings over Σ = {a, b} that contain not more than one occurrence of the string aa 
RE => R = (b + ab)*(ε + a + aa)(b + ba)* 
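Answers of this kind can be sanity-checked by brute force: enumerate every string up to some length and compare the regular expression with a direct test of the stated property. The sketch below does this for three of the items above (no two consecutive a’s, ends in a double letter, exactly one double letter). It uses Python’s re module, where | stands for the + of the text and an empty alternative stands for ε; the helper names and the length bound of 8 are illustrative assumptions, and a bounded check is evidence rather than a proof.

```python
import re
from itertools import product

def words(alphabet, max_len):
    """Yield every string over `alphabet` with at most max_len symbols."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def doubles(w):
    """Number of positions at which a letter is immediately repeated."""
    return sum(1 for x, y in zip(w, w[1:]) if x == y)

# name -> (regular expression in Python syntax, direct test of the property)
checks = {
    "no two consecutive a's": (r"(b|ab)*(a|)", lambda w: "aa" not in w),
    "ends in a double letter": (r"(a|b)*(aa|bb)",
                                lambda w: len(w) >= 2 and w[-1] == w[-2]),
    "exactly one double letter": (
        r"(|b)(ab)*aa(ba)*(|b)|(|a)(ba)*bb(ab)*(|a)",
        lambda w: doubles(w) == 1,
    ),
}

results = {
    name: all(bool(re.fullmatch(rx, w)) == prop(w) for w in words("ab", 8))
    for name, (rx, prop) in checks.items()
}
print(results)
```

Any entry reported as False would point to a string on which the expression and the property disagree.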
3.4 Languages Associated with Regular Expressions 


A language represented by a regular expression defines a regular language. In other words, 
a regular language is a language that can be represented by a regular expression. If ‘R’ is 
a regular expression, then L(R) denotes the language associated with R. This language is 
defined formally as follows. 


3.4.1 Definition 


The language L(R) denoted by any regular expression R is defined by the following rules: 


a. φ is a regular expression denoting the empty set. 
b. ε is a regular expression denoting {ε}. 
c. For every a ∈ Σ, a is a regular expression denoting {a}. 


EXAMPLE 3.4.1: Describe the language defined by the following regular expressions: 


a. R=ab*a 
Solution: 
The language L, defined by regular expression ‘R’, is the set of all strings of a’s and 
b’s that begin and end with a’s and have nothing but b’s inside 


i.e., L(R) = {aa, aba, abba, abbba, abbbba, ...}. 
b. R=a*b* 
Solution: 


The language L, defined by regular expression ‘R’ is the set containing the strings of 
a’s and b’s, in which all a’s (if any) come before all b’s (if any) 


i.e., L(R) = {ε, a, b, aa, ab, bb, aaa, aab, abb, bbb, ...}. 
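The first few members of L(R) can also be listed mechanically. The following sketch enumerates all strings over {a, b} up to a chosen length and keeps those that the expression matches in full; the function name language and the length bounds are assumptions made for illustration.

```python
import re
from itertools import product

def language(pattern, alphabet="ab", max_len=3):
    """Strings over `alphabet` of length <= max_len that `pattern` matches in full."""
    found = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if re.fullmatch(pattern, w):
                found.append(w)
    return found

print(language("ab*a", max_len=4))  # ['aa', 'aba', 'abba']
print(language("a*b*", max_len=3))  # ['', 'a', 'b', 'aa', 'ab', 'bb', 'aaa', 'aab', 'abb', 'bbb']
```

The empty string '' in the second list plays the role of ε.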


3.5 Properties of Regular Expressions 


If R1 and R2 are regular expressions, then 
a. L(R1 + R2) = L(R1) ∪ L(R2) 

Example: 
If R1 = ab*, R2 = ab*a then L(ab* + ab*a) = L(ab*) ∪ L(ab*a) 
b. L(R1 · R2) = L(R1) · L(R2) 
Example: 
If R1 = a*, R2 = aba* then L(a* · aba*) = L(a*) · L(aba*) 
c. L((R1)) = L(R1) 
d. L(R1*) = (L(R1))* 
e. L(R) · {ε} = {ε} · L(R) = L(R) 


3.5.1 Algebra of Regular Expressions 


Let R,S and T be any arbitrary regular expressions. Then, the following properties 


are true: 

a. (R + S) + T = R + (S + T). 

b. R + R = R. 

c. R + φ = φ + R = R. 

d. R + S = S + R. 

e. Rφ = φR = φ. 

f. Rε = εR = R. 

g. (RS)T = R(ST). 

h. R(S + T) = RS + RT. 

i. (S + T)R = SR + TR. 

j. φ* = ε* = ε. 

k. R*R* = R* = (R*)*. 

l. RR* = R*R = R⁺, and R* = ε + RR*. 
m. (R+S)* = (R*S*)* = (R* + S*)*. 
n. (RS)*R = R(SR)*. 
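Identities of this kind can be spot-checked by comparing the two sides on all strings up to a fixed length. The sketch below does so for properties h, i, k and m with one arbitrary choice of R, S and T; the helper lang, the sample expressions and the bound of 5 symbols are assumptions for illustration, and agreement on a bounded set does not replace a proof.

```python
import re
from itertools import product

def lang(pattern, alphabet="ab", max_len=5):
    """All strings over `alphabet`, up to max_len symbols, matched in full by `pattern`."""
    return {
        "".join(t)
        for n in range(max_len + 1)
        for t in product(alphabet, repeat=n)
        if re.fullmatch(pattern, "".join(t))
    }

# Sample expressions in Python syntax, where | plays the role of +.
R, S, T = "(ab)", "(b)", "(a*)"

checks = {
    "h": lang(f"{R}({S}|{T})") == lang(f"{R}{S}|{R}{T}"),
    "i": lang(f"({S}|{T}){R}") == lang(f"{S}{R}|{T}{R}"),
    "k": lang(f"{R}*{R}*") == lang(f"{R}*") == lang(f"({R}*)*"),
    "m": lang(f"({R}|{S})*") == lang(f"({R}*{S}*)*") == lang(f"({R}*|{S}*)*"),
}
print(checks)
```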


3.5.2 Basic operations of Regular Expressions: 
Let R and S be any two regular expressions. 


a. Concatenation - RS denotes the set {xy | x ∈ R and y ∈ S} 

Example: 
If R = {ab, c} and S = {d, ef}, 
then RS = {ab, c} · {d, ef} 
= {abd, cd, abef, cef}. 

b. Union - The union R ∪ S denotes the set union of R and S. 
Example: 
If R = {ab, c} and S = {d, ef} 
then R ∪ S = {ab, c} ∪ {d, ef} 
= {ab, c, d, ef} 

c. Kleene closure or star closure - R* denotes the smallest superset of R that contains 
ε and is closed under string concatenation. This is the set of all strings that can be 
made by concatenating zero or more strings in R. 

Examples 

(i) {ab, c}* = {ε, ab, c, abab, abc, cab, cc, ababab, ...} 
(ii) {01, 2}* = {ε, 01, 2, 0101, 012, 201, 22, 010101, 012012, ...} 
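The three operations translate directly into operations on finite sets of strings. In the sketch below, union is Python’s built-in set union, concat follows the definition {xy | x ∈ R and y ∈ S}, and star builds only the part of R* up to a given length, since R* itself is infinite; the function names and the length bound are illustrative assumptions.

```python
def concat(R, S):
    """Concatenation: every string of R followed by every string of S."""
    return {x + y for x in R for y in S}

def star(R, max_len):
    """The strings of R* whose length does not exceed max_len."""
    result = {""}            # the empty string stands for ε
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more factor from R.
        frontier = {w for w in concat(frontier, R) if len(w) <= max_len} - result
        result |= frontier
    return result

R = {"ab", "c"}
S = {"d", "ef"}
print(sorted(concat(R, S)))   # ['abd', 'abef', 'cd', 'cef']
print(sorted(R | S))          # ['ab', 'c', 'd', 'ef']
print(sorted(star(R, 3)))     # ['', 'ab', 'abc', 'c', 'cab', 'cc', 'ccc']
```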


EXAMPLE 3.5.1: Form the string set for the regular expression given below. 

a. 1*0 
Solution: 
1*0 = {ε, 1, 11, 111, ...} · 0 
= {0, 10, 110, 1110, ...} 
b. 00* 
Solution: 
00* = 0 · {ε, 0, 00, 000, ...} 
= {0, 00, 000, 0000, ...}; note that 00* = 0⁺. 
c. 10*1 
Solution: 
10*1 = 1 · {ε, 0, 00, 000, ...} · 1 
= {1, 10, 100, 1000, ...} · 1 
= {11, 101, 1001, 10001, ...} 
d. (100⁺)* 
Solution: 
(100⁺)* = (10 · {0, 00, 000, ...})* 
= {100, 1000, 10000, ...}* 
= {ε, 100, 100100, 100100100, 1000, 10001000, ...} 
e. a*b* 
Solution: 
a*b* = {ε, a, aa, ...} · {ε, b, bb, ...} 
= {ε, b, bb, a, ab, abb, aa, aab, aabb, ...} or 
= {ε, a, aa, b, bb, ab, abb, aab, aabb, ...} 
f. (0 + 1)* 
Solution: 
(0 + 1)* = ({0} ∪ {1})* 
= {0, 1}* 
= {ε, 0, 1, 00, 01, 10, 11, 000, ...} 
This is the set of all strings over {0, 1}. Note that it is not the same as 0* ∪ 1*, 
which contains only strings made of a single repeated symbol. 
g. (a + c)b* 
Solution: 
(a + c)b* = ab* ∪ cb* 
= a{ε, b, bb, ...} ∪ c{ε, b, bb, ...} 
= {a, ab, abb, ...} ∪ {c, cb, cbb, ...} 
= {a, c, ab, cb, abb, cbb, ...} 
h. (1 + 10)* 
Solution: 
(1 + 10)* = {1, 10}* 
= {ε, 1, 10, 11, 101, 110, 111, 1010, ...} 
This is the set of strings in which every 0 is immediately preceded by a 1. 
i. (0 + 1)*011 
Solution: 
(0 + 1)*011 = {0, 1}* · {011} 
= {ε, 0, 1, 00, 01, 10, 11, ...} · {011} 
= {011, 0011, 1011, 00011, 01011, 10011, 11011, ...} 
This is the set of all strings over {0, 1} that end in 011. 
j. (0* + 1*)* 
Solution: 
(0* + 1*)* = (0* ∪ 1*)* 
= ({ε, 0, 00, ...} ∪ {ε, 1, 11, ...})* 
= {ε, 0, 00, 1, 11, ...}* 
= {0, 1}* 
Since the set being starred contains both 0 and 1, every string over {0, 1} can be 
formed, so (0* + 1*)* = (0 + 1)*. 

k. (0 + 1)*00(0 + 1)* 
Solution: 
(0 + 1)*00(0 + 1)* = {0, 1}* · {00} · {0, 1}* 
= {00, 000, 001, 100, 0000, 0001, 0010, 0011, 0100, 1000, 1001, 1100, ...} 
This is the set of all strings over {0, 1} that contain 00 as a substring. 

l. r = (0 + 1) · ((0 + 1) · (0 + 1))* 
Solution: 
r = {0, 1} · ({0, 1} · {0, 1})* 
= {0, 1} · {00, 01, 10, 11}* 
= {0, 1} · {ε, 00, 01, 10, 11, 0000, 0001, ...} 
= {0, 1, 000, 001, 010, 011, 100, 101, 110, 111, 00000, ...} 
This is the set of all strings over {0, 1} of odd length. 


EXAMPLE 3.5.2: Find the language, for the regular expressions given below: 
a. R = a*(a + b) 
Solution: 
L(R) = L(a* · (a + b)) 
= L(a*) · L(a + b) 
= (L(a))* · (L(a) ∪ L(b)) 
= {ε, a, aa, aaa, ...} · {a, b} 
L(R) = {a, aa, aaa, ..., b, ab, aab, aaab, ...}. 
b. R = (a + b)*(a + bb) 
Solution: 
L(R) = L((a + b)* · (a + bb)) 
= L((a + b)*) · L(a + bb) 
= (L(a + b))* · L(a + bb) 
= (L(a) ∪ L(b))* · (L(a) ∪ L(bb)) 
= {a, b}* · {a, bb} 
= {ε, a, b, aa, ab, ba, bb, ...} · {a, bb} 
L(R) = {a, bb, aa, ba, abb, bbb, aaa, aba, baa, bba, ...}, 
i.e., all strings over {a, b} that end in a or in bb. 
c. R = a + b 
Solution: 
L(R) = L(a + b) 
= L(a) ∪ L(b) 
= {a} ∪ {b} 
L(R) = {a, b}. 
d. R = a + b* 
Solution: 
L(R) = L(a + b*) 
= L(a) ∪ L(b*) 
= L(a) ∪ (L(b))* 
= {a} ∪ {ε, b, bb, ...} 
L(R) = {ε, a, b, bb, ...} 
e. R = a*bc* + ac 
Solution: 
L(R) = L(a*bc* + ac) 
= L(a*bc*) ∪ L(ac) 
= (L(a*) · L(b) · L(c*)) ∪ (L(a) · L(c)) 
= ((L(a))* · L(b) · (L(c))*) ∪ (L(a) · L(c)) 
= ({ε, a, aa, ...} · {b} · {ε, c, cc, ...}) ∪ ({a} · {c}) 
= ({b, ab, aab, ...} · {ε, c, cc, ...}) ∪ {ac} 
= {b, ab, aab, ..., bc, abc, aabc, ..., bcc, abcc, ...} ∪ {ac} 
L(R) = {b, ab, aab, ..., bc, abc, aabc, ..., bcc, abcc, ..., ac}. 

f. R = ab*a 
Solution: 
L(R) = L(ab*a) 
= L(a) · L(b*) · L(a) 
= {a} · (L(b))* · {a} 
= {a} · {ε, b, bb, bbb, ...} · {a} 
= {a, ab, abb, ...} · {a} 
= {aa, aba, abba, ...} 
g. R = a*b* 
Solution: 
L(R) = L(a*b*) 
= L(a*) · L(b*) 
= (L(a))* · (L(b))* 
= {ε, a, aa, ...} · {ε, b, bb, ...} 
L(R) = {ε, a, aa, b, ab, aab, bb, abb, aabb, ...}. 
or 
L(R) = {ε, a, b, aa, ab, bb, aaa, aab, abb, bbb, ...}. 


EXAMPLE 3.5.3: Give the regular expressions for the following languages: 
a. L(R) = {a, b, c} 
Solution: 
R = a + b + c because L(R) = L(a + b + c) 
= L(a) ∪ L(b) ∪ L(c) 
= {a, b, c}. 
b. L(R) = {a, b, ab, ba, abb, baa, ...} 
Solution: 
R = ab* + ba* because L(R) = L(ab* + ba*) 
= L(ab*) ∪ L(ba*) 
= ({a} · {ε, b, bb, ...}) ∪ ({b} · {ε, a, aa, ...}) 
= {a, ab, abb, ...} ∪ {b, ba, baa, ...} 
= {a, b, ab, ba, abb, baa, ...} 

c. L(R) = {ε, a, abb, abbbb, ...} 
Solution: 
R = ε + a(bb)* because L(R) = L(ε + a(bb)*) 
= L(ε) ∪ (L(a) · (L(bb))*) 
= {ε} ∪ ({a} · {ε, bb, bbbb, ...}) 
= {ε} ∪ {a, abb, abbbb, ...} 
= {ε, a, abb, abbbb, ...} 

d. L(R) = {0, 1, 10, 11, 100, 101, 110, 111, ...} 
Solution: 
R = 0 + 1(0 + 1)* because L(R) = L(0 + 1(0 + 1)*) 
= L(0) ∪ (L(1) · L((0 + 1)*)) 
= {0} ∪ ({1} · (L(0) ∪ L(1))*) 
= {0} ∪ ({1} · {0, 1}*) 
= {0} ∪ ({1} · {ε, 0, 1, 00, 01, 10, 11, ...}) 
= {0} ∪ {1, 10, 11, 100, 101, 110, 111, ...} 
= {0, 1, 10, 11, 100, 101, 110, 111, ...} 
e. L(R) = {aa, ab, ba, bb} 
Solution: 
R = aa + ab + ba + bb 
or 
R = (a + b)(a + b) 
f. L(R) = {a^(2n) b^(2m+1) | n ≥ 0, m ≥ 0}. 
This language denotes the set of all strings with an even number of a’s followed by 
an odd number of b’s 
=> R = (aa)*(bb)*b. 
g. If Σ = {0, 1} and L(R) = {w ∈ Σ* | w has at least one pair of consecutive zeros} 
=> R = (0 + 1)*00(0 + 1)*. 

h. If Σ = {0, 1} and L(R) = {w ∈ Σ* | w has any number of 0’s or 1’s and ends with 011} 
=> R = (0 + 1)*011. 

i. If Σ = {0, 1, 2} and L(R) = {w ∈ Σ* | w has 0’s, 1’s and 2’s with at least one 0 followed 
by at least one 1 followed by at least one 2} 
=> R = 00*11*22*. 

j. If Σ = {a, b} and L(R) = {w ∈ Σ* | w has a’s and b’s having a substring aa} 
=> R = (a + b)*aa(a + b)*. 


EXAMPLE 3.5.4: 
a. Find the shortest string that is not in the language represented by the regular 
expression a*(ab)*b*. 
Solution: 
Let R = a*(ab)*b* 
L(R) = (L(a))* · (L(ab))* · (L(b))* 
= {ε, a, aa, ...} · {ε, ab, abab, ...} · {ε, b, bb, ...} 
= {ε, a, b, aa, ab, bb, aaa, aab, abb, bbb, ...} 
It is seen that the strings ε, a and b, of length one or less, are in the language. The strings 
of length two in the language are aa, bb and ab. However, ba is not in the language L(R). 
Hence, ‘ba’ is the shortest string not in the language. 
b. For the two regular expressions given below, 
i. find a string corresponding to r2 but not to r1 and 
ii. find a string corresponding to both r1 and r2. 
r1 = a* + b*, r2 = ab* + ba* + b*a + (a*b)* 
Solution: 
i. Any string consisting of only a’s or only b’s and the empty string are in r1. So, 
we need to find strings of r2 which contain at least one a and at least one b. For 
example, ab and ba are such strings. 




Downloaded from https:/www.cambridge.org/core. University of Birmingham, on 17 Apr 2017 at 08:02:51, subject to the Cambridge Core terms of use, available at 
https:/www.cambridge.org/core/terms. https://doi.org/10.1017/UPO9788175968363.004 


A Textbook on Automata Theory 


ii. A string corresponding to r1 consists of only a’s or only b’s, or is the empty 
string. The only strings corresponding to r2 which consist of only a’s or only b’s are 
a, b, the empty string and the strings consisting of only b’s (the last two from (a*b)*). 
Hence ε, a, b, bb, bbb, ... correspond to both r1 and r2. 
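The search in part (a) can be automated: try the strings over {a, b} in order of increasing length and stop at the first one that the expression does not match in full. The sketch below is one way to do this with Python’s re module; the function name shortest_missing is an assumption, and the loop would never terminate for an expression that matches every string over the alphabet.

```python
import re
from itertools import product

def shortest_missing(pattern, alphabet="ab"):
    """Return the first string, in order of length, that `pattern` does not match in full."""
    n = 0
    while True:
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if re.fullmatch(pattern, w) is None:
                return w
        n += 1

print(shortest_missing("a*(ab)*b*"))  # ba
```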


3.6 Uses of Regular Expressions 


a. In general, the applications of regular expressions can be classified as follows: 

• Validation: Checking the correctness of inputs, i.e., whether the given string 
complies with a set of format constraints or not. 

• Search and Selection: Identification of a subset of items from a larger set, on the 
basis of a pattern match, i.e., a regular expression (e.g., use of the grep command 
to locate files and lines within files). 

• Tokenisation: Conversion of a string of characters into a sequence of words for later 
interpretation, i.e., the generation of a program called a lexical analyser (generated 
by the lex command) that performs lexical processing of its character input. The 
first step of compilation, called lexical analysis, is to convert the input from a simple 
sequence of characters into a list of tokens of different kinds, like numerical and 
string constants, variable identifiers and programming language keywords. Here, 
the source code for a lex program is a table of regular expressions, coupled with 
corresponding actions. 

b. RE offer a declarative way (algebraic definition) to express sets of strings. 
Example: the set of strings that consist of a single 0 followed by any number of 1s, or 
a single 1 followed by any number of 0s, i.e., 01* + 10*. 

c. RE are used to define languages. They define exactly the languages accepted by finite 
automata. 
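The first two uses and the declarative example can be illustrated briefly with Python’s re module. This is only a sketch: the date-like format and the function names are hypothetical choices made for the illustration, and | stands for the + of the text.

```python
import re

def is_date_like(text):
    """Validation: the whole input must have the (hypothetical) shape YYYY-MM-DD."""
    return re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", text) is not None

def select(pattern, lines):
    """Search and selection: keep the lines that contain a match, as grep does."""
    return [line for line in lines if re.search(pattern, line)]

def in_example_set(text):
    """Membership in the set described declaratively by 01* + 10*."""
    return re.fullmatch(r"01*|10*", text) is not None

print(is_date_like("2017-04-17"), is_date_like("17 April 2017"))
print(select(r"ab+", ["xabbx", "ba", "a"]))
print(in_example_set("0111"), in_example_set("1000"), in_example_set("0110"))
```

Validation uses a full match because the whole input must fit the format, whereas selection uses a search because a match anywhere in the line is enough.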


3.6.1 Regular Expression in Lexical Analysis 


The process of translation by compilers is called compilation. This process is quite a complex 
task and hence is divided into a number of phases, as shown in figure 3.1. 

Figure 3.1. Compilation Process Phases (Source program, Lexical Analysis, Syntax Analysis, 
Intermediate Code Generation, Code Optimisation, Object Code). 

Lexical analysis is the first step of compilation. In this phase, the compiler scans the 
characters of the source program, one character at a time. When it gets a sufficient number of 
characters to constitute a token of the specified language, it outputs that token. 

In order to perform this task, the lexical analyser must know the keywords, identifiers, 
operators, delimiters and punctuation symbols of the language to be implemented. So, when 
it scans the source program, it will be able to return a suitable token. Therefore, the lexical 
analyser design must, 

a. specify the tokens of the language, and 
b. suitably recognise the tokens. 


Therefore, the first thing that is required is to identify the keywords, identifiers, operators, 
delimiters and punctuation. These are the tokens of the language. Once the tokens of the 
language have been identified, a suitable notation is needed to specify them. 
Regular expressions provide this notation. An RE can be used to specify a set 
of strings, and a set of strings that can be specified by using RE notation is called a ‘regular 
set’. Thus, RE for things like operators, keywords, identifiers etc. of a typical programming 
language are as follows: 

digit = [0-9] 

alphabet = [a-zA-Z] 

identifier = [a-zA-Z][a-zA-Z0-9]* 

keyword = “if” | “else” | “while” | “int” | “float” | “main” | “void” 

operators = + | - | * | / | mod | div 

boolean = “true” | “false” 

comment = “//” [a-zA-Z]* (“\r” | “\n” | “\r\n”) 

whitespace = \n | \r | \t 

The advantage of using RE notation for specifying tokens is that, when RE are used, the 
recogniser for the tokens turns out to be a finite automaton (DFA). Thus, designing a lexical 
analyser becomes a simple process of transforming RE into finite automata and generating the 
program for simulating the finite automaton. This process is done by using a software 
tool called LEX. 
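The idea of a table of regular expressions driving the recogniser can be sketched in a few lines of Python. This is not LEX and not the construction of a DFA by hand: it hands the expressions to Python’s re module and repeatedly matches at the current position. The token names follow the list above, while the “number” class, the more permissive comment pattern, the space in the whitespace class and the function name tokenize are assumptions made for the illustration.

```python
import re

# Order matters: earlier entries win, so keywords are tried before identifiers.
TOKEN_SPEC = [
    ("comment",    r"//[^\r\n]*"),
    ("keyword",    r"\b(?:if|else|while|int|float|main|void)\b"),
    ("boolean",    r"\b(?:true|false)\b"),
    ("operator",   r"[+\-*/]|\b(?:mod|div)\b"),
    ("identifier", r"[a-zA-Z][a-zA-Z0-9]*"),
    ("number",     r"[0-9]+"),
    ("whitespace", r"[ \n\r\t]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))

def tokenize(text):
    """Split `text` into (kind, lexeme) pairs, dropping whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup not in ("whitespace", "comment"):
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("while x1 div 2 // half\n y"))
```

For the sample input, the scanner reports a keyword, an identifier, an operator, a number and a final identifier, and silently skips the comment and the white space.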


3.6.2 Traditional Unix Regular Expressions (regexps) 


The ‘basic’ Unix regexp syntax has been superseded by POSIX, but is still widely used for 
the purposes of backwards compatibility. Most Unix utilities (grep, sed, ...) use it by default. 
In this syntax, most characters are treated as literals: they match only themselves (“a” 
matches “a”, “abc” matches “abc”, etc.). The exceptions are called metacharacters: 

.       Matches any single character. 
[ ]     Matches a single character that is contained within the brackets: [abc] matches 
        “a”, “b”, or “c”. [a-z] matches any lowercase letter. These can be mixed: 
        [abcq-z] matches a, b, c, q, r, s, t, u, v, w, x, y, z, and so does [a-cq-z]. 
[^ ]    Matches a single character that is not contained within the brackets: [^abc] 
        matches any character other than “a”, “b”, or “c”. [^a-z] matches any single 
        character that isn’t a lowercase letter. As above, these can be mixed. 
^       Matches the start of the line (or any line, when applied in multiline mode). 
$       Matches the end of the line (or any line, when applied in multiline mode). 
\( \)   Marks a part of the expression. The match of the enclosed expression can be 
        recalled by \n, where n is a digit from 1 to 9. 
\n      Matches the exact string that the expression enclosed in the n’th left 
        parenthesis and its pairing right parenthesis has matched. This construct 
        is theoretically irregular and has not been adopted in the extended regular 
        expression syntax. 
*       A single character expression followed by “*” matches zero or more copies 
        of the expression. For example, “[xyz]*” matches “”, “x”, “y”, “zx”, “zyx”, 
        and so on. 


        A \n*, where n is a digit from 1 to 9, matches zero or more iterations of 
        the exact string that the expression enclosed in the n’th left parenthesis and 
        its pairing right parenthesis has matched. For example, “\(a??\)\1” 
        matches “abcbc” and “adede” but not “abcde”. 
        An expression enclosed in “\(” and “\)”, followed by “*”, is deemed 
        to be invalid. In some cases (e.g. /usr/bin/xpg4/grep of SunOS 5.8), 
        it matches zero or more iterations of the same string which the 
        enclosed expression matches. In other cases (e.g. /usr/bin/grep of 
        SunOS 5.8), it matches what the enclosed expression matches, followed 
        by a literal “*”. 
\{x,y\} Matches the last ‘block’ at least x and not more than y times: “a\{3,5\}” 
        matches “aaa”, “aaaa” or “aaaaa”. Note that this is not found in some instances 
        of regex. 


EXAMPLE 3.6.1: 


“.at” matches any three-character string ending with “at”. 

“[hc]at” matches “hat” and “cat”. 

“[^b]at” matches any three-character string ending with “at” and not beginning 
with ‘b’. 

“^[hc]at” matches “hat” and “cat” but only at the beginning of a line. 

“[hc]at$” matches “hat” and “cat” but only at the end of a line. 
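These five patterns can be tried out directly. The sketch below uses Python’s re module rather than grep or sed; Python’s syntax is not the traditional Unix syntax, but the metacharacters used in this example (dot, bracket expressions, ^ and $) behave the same way. The helper name found is an assumption.

```python
import re

def found(pattern, text):
    """True if `pattern` matches somewhere in `text`, as a grep-style search would."""
    return re.search(pattern, text) is not None

print(found(r".at", "hat"))          # True
print(found(r"[hc]at", "bat"))       # False
print(found(r"[^b]at", "bat"))       # False
print(found(r"^[hc]at", "the hat"))  # False
print(found(r"[hc]at$", "the hat"))  # True
```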


POSIX modern (extended) regexps: The modern “extended” regexp can often be 
used with modern Unix utilities by including the command line flag “-E”. 

POSIX extended regexps are similar in syntax to the traditional Unix regexp, with some 
exceptions. The following metacharacters are added: 

+   Match the last ‘block’ one or more times: “ba+” matches “ba”, “baa”, “baaa” and 
    so on. 
?   Match the last ‘block’ zero or one times: “ba?” matches “b” or “ba”. 
|   The choice (or set union) operator: match either the expression before or the expression 
    after the operator: “abc|def” matches “abc” or “def”. 

Also, backslashes are removed: \{...\} becomes {...} and \(...\) becomes (...). 

EXAMPLE 3.6.2: 

“[hc]+at” matches “hat”, “cat”, “hhat”, “chat”, “hcat”, “ccat” etc. 
“[hc]?at” matches “hat”, “cat” and “at”. 
“([cC]at|[dD]og)” matches “cat”, “Cat”, “dog” and “Dog”. 

Since the characters ‘(’, ‘)’, ‘[’, ‘]’, ‘.’, ‘*’, ‘?’, ‘+’, ‘^’ and ‘$’ are used as special symbols, 
they have to be ‘escaped’ somehow if they are meant literally. This is done by preceding 
them with ‘\’, which therefore itself has to be ‘escaped’ this way if meant literally. 

EXAMPLE 3.6.3: 
“\.(\(|\))” matches “.(” and “.)”; for instance, it matches the last two characters of 
the string “a.)”. 
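The extended operators and the escaping rule can be checked in the same way. The sketch below again relies on Python’s re module, whose syntax for +, ?, | and backslash escapes agrees with the extended syntax for these examples; re.escape is a standard library helper that adds the backslashes automatically.

```python
import re

# "+" : one or more of the preceding block
plus_matches = [w for w in ["hat", "cat", "hhat", "chat", "hcat", "ccat", "at"]
                if re.fullmatch(r"[hc]+at", w)]

# "?" : zero or one of the preceding block
optional_matches = [w for w in ["hat", "cat", "at", "hhat"]
                    if re.fullmatch(r"[hc]?at", w)]

# "|" : choice between two alternatives
choice_matches = [w for w in ["cat", "Cat", "dog", "Dog", "cow"]
                  if re.fullmatch(r"([cC]at|[dD]og)", w)]

# Escaped metacharacters are matched literally.
literal = re.search(r"\.(\(|\))", "a.)").group()

# re.escape builds a pattern that matches its argument literally.
escaped_ok = re.fullmatch(re.escape("a.)"), "a.)") is not None

print(plus_matches, optional_matches, choice_matches, literal, escaped_ok)
```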


3.7.1 Regular Languages and Constructions 


1. State whether the following are true or false. 
a. 0* ∩ 1⁺ = φ 
b. gt =o 
c. 0*(0* M 1*) = {0, 1}*0* 
d. 0* ∪ 1* = {0, 1}* 
2. Prove ((¢ o b) U (b* oc)) is a regular language, using the definition 3.2.1. 


3. Suppose the current directory has a large (over 80) number of C files (i.e., files ending 
in the .c extension). 


a. Write a UNIX command to list out all the C files. 

b. Write a UNIX command to copy all the files from the current directory to a 
subdirectory. 

c. Write a UNIX command to delete all C files from the current directory. 


3.7.2 Regular Expressions 


1. Write a standard regular expression that represents each of the following languages 
in its shortest form: 
a. * 
b. (0*1*)* 
c. (0 ∪ 1)*0(0 ∪ 1)*1(0 ∪ 1)* 
d. QU1)*UL* UM U0)* 
2. Suppose Σ = {0, 1}. Write the shortest regular expression (in terms of number of 
symbols) for each of the following languages: 
a. The empty language 
b. The language consisting of only the empty string 
c. The language consisting of all strings of length three or more 
d. The language consisting of all strings containing the pattern 010 
e. The language consisting of all strings where 0s may occur only in the even 
positions 

3. Is the set of file names, contained in a given UNIX directory, regular? Justify your 
answer. 


3.7.3 Applications of Regular expressions 


1. Write an extended RE to validate an input that contains (a) exactly four decimal 
integers, the first of which must be positive and (b) a floating point number possibly 
with an exponent. 

2. Write a grep expression to print all the words in the file /usr/dict/words that contain 
the letters (a) e, l, m and t in alphabetical order and (b) e, t, m, l in any order. 

3. How many times does the word “UNIX” appear in the grep man page? 

4. Write a grep expression to search all .html files in the directory public_html for the 
pattern house.gif. 
Finite Automata 


‘Welcome to the wonderful world of finite state machines.’ 


Introduction 


Computation is a concept common to all computing machines, regardless of the messy 
details associated with their hardware implementation. However, actual computing 
machines/computers are too complicated (due to the several constraints caused by physical 
reality) for a manageable mathematical theory to be ascribed to them. Therefore, in order 
to fully understand the power and limitation of real machines, idealised computers or 
computational models are designed and studied. These idealised computers may be accurate 
in some ways but perhaps not in others. 

There are several computational models and the purpose of a computational model is 
to capture the computational aspects that are relevant to the particular problem under 
consideration while hiding the other unimportant aspects. Thus, a computational model 
can be thought of as a custom machine designed to suit particular needs. Some of the 
important computational models are the deterministic finite automaton (DFA), the 
nondeterministic finite automaton (NFA), the deterministic pushdown automaton (DPDA), the 
nondeterministic pushdown automaton (NPDA), the deterministic Turing machine (DTM) 
and the nondeterministic Turing machine (NTM). Each of these models has a 
special significance in the theory of computation. The most basic computational model is 
the deterministic finite automaton (DFA). 


4.1 Finite Automata 


As discussed in chapter 1, a finite automaton is a mathematical model of a system with 
discrete inputs and outputs. Such a system can be in any one of a finite number of 
internal configurations or ‘states’, and each state of the system provides sufficient information 
concerning the past inputs so that the behaviour of the system can be determined on the 
provision of subsequent inputs. There are many examples of finite automata, such as the control 
mechanism of an elevator, the controller for an automatic door, the human brain, controllers of 
various household appliances, digital watches, calculators and many examples in computer 
science. In fact, the computer itself can be viewed as a finite state system. Switching circuits, 
which are the control units of a computer, are designed in such a way that their logical design 
is separated from the electronic implementation and hence they can be viewed as finite state 
systems. Also, certain commonly used programs such as text editors and the lexical analysers 
found in most compilers are finite state systems. 

In order to locate the strings of characters corresponding to identifiers, reserved words, 
numerical constants etc., a lexical analyser scans the symbols in a computer program. Thus, 
in the design of efficient string processors, the theory of finite automata is extensively used. 

Theoretically, the state of the CPU, main memory and auxiliary storage at any given time 
correspond to a very large but finite number of states. However, finite automata are indeed 
good models for computers with an extremely limited amount of memory. Before giving 
a formal definition of finite automaton, an example is considered to illustrate its salient 
features. 


EXAMPLE 4.1.1: (Automatic door Controller) 


Consider the controller for an automatic door which is often found at a supermarket entrance. 
These doors automatically swing open on sensing the approach of a person. A top view of 
an automatic door is presented in figure 4.1. There are two pads—one in front of the door 
and the other located at the rear of the doorway. 


Figure 4.1. A Top View of an Automatic Door 


The presence of a person who is about to walk through the doorway is detected by 
the front pad, while the rear pad makes the controller hold the door open long enough 
for the person to pass through and also keeps the door from hitting anyone standing 
behind it. 

In this system, the controller is in either of two states, ‘open’ or ‘closed’, representing the 
corresponding conditions of the door at any time. The four possible input conditions are: 

a. Front: a person is standing on the front pad. 
b. Rear: a person is standing on the rear pad. 
c. Both: people are standing on both the pads. 
d. Neither: no one is on either of the pads. 

In figure 4.2 and Table 4.1, the state diagram and the transition table for the automatic door 
controller are presented. 

Figure 4.2. State Diagram for Automatic Door Controller (the ‘closed’ state loops on Rear, 
Both and Neither and moves to ‘open’ on Front; the ‘open’ state loops on Front, Rear and 
Both and moves to ‘closed’ on Neither) 

State \ Signal   Neither   Front   Rear     Both 
Closed           Closed    Open    Closed   Closed 
Open             Closed    Open    Open     Open 

Table 4.1. State Transition Table for Automatic Door Controller 


The controller moves from one state to the other, depending upon the input it receives. 
For example, a controller which is initially in ‘closed’ state, after receiving the input signals: 
Front, Rear, Neither, Front, Both, Neither, Rear, Neither, would move through the series of 
states: open, open, closed, open, open, closed, closed, closed. 
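The transition table can be turned into a small program almost literally. The sketch below stores Table 4.1 as a dictionary and replays the sequence of input signals from the text; the names TRANSITIONS and run are assumptions made for the illustration.

```python
# Table 4.1 as a dictionary: (current state, signal) -> next state
TRANSITIONS = {
    ("closed", "neither"): "closed",
    ("closed", "front"):   "open",
    ("closed", "rear"):    "closed",
    ("closed", "both"):    "closed",
    ("open",   "neither"): "closed",
    ("open",   "front"):   "open",
    ("open",   "rear"):    "open",
    ("open",   "both"):    "open",
}

def run(state, signals):
    """Return the list of states visited while reading `signals` one by one."""
    trace = []
    for signal in signals:
        state = TRANSITIONS[(state, signal)]
        trace.append(state)
    return trace

signals = ["front", "rear", "neither", "front", "both", "neither", "rear", "neither"]
print(run("closed", signals))
# ['open', 'open', 'closed', 'open', 'open', 'closed', 'closed', 'closed']
```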

Clearly, the controller considered here is a computer with just a single bit of memory, 
capable of recording the two states. But in the case of an elevator, the controller requires 
several bits of memory to keep track of the information with regard to the floor. Thus, the 
design of such finite state systems requires a thorough knowledge of the methodology and 
terminology of finite automata. 
4.1.1 Finite Automata from a Mathematical Perspective 

A good mathematical theory of finite automata is needed in order to develop a precise 
definition of a finite automaton, the associated terminology for describing and manipulating 
finite automata, and the theoretical results that provide a complete description of their power 
and limitations. 

Consider a finite automaton M as depicted in figure 4.3. 

Figure 4.3. A Finite Automaton M with Three States q1, q2 and q3 

Figure 4.3 is the state diagram of M, which has 

a. three states q1, q2 and q3, where q1 is the start state and q2 is the accept state, 
b. the transitions, represented by arrows, 
c. the input string 1101, 
d. the output, which is either accept or reject. 

The processing of the input begins in M at the start state. As it receives the symbols from 
the string one by one from left to right, M moves from one state to another as directed by the 
transitions. When the last symbol is read, M produces the output ‘accept’ if M is in the accept 
state; otherwise the output is ‘reject’. The step-by-step process can be described as follows: 

Step 1: Start state q1 
Step 2: read 1; transition from q1 to q2 
Step 3: read 1; transition from q2 to q2 
Step 4: read 0; transition from q2 to q3 
Step 5: read 1; transition from q3 to q2. 

Output is ‘accept’, since M is in the accept state q2 at the end of the input. 
A careful experimentation with the machine M with a variety of input strings reveals that 
the following are ‘accept’ strings: 

1, 01, 11, 0101010101, 100, 0100, 110000 and 0101000000; 

The ‘reject’ strings are: 0, 10, 101000. 
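The behaviour of M can be reproduced with a table-driven simulation. In the sketch below, four transitions are taken from the step-by-step trace; the remaining two (q1 stays in q1 on 0, and q3 moves to q2 on 0) are assumptions, chosen because they are consistent with the accept and reject strings listed above, since the diagram itself is not reproduced here. The names DELTA and accepts are illustrative.

```python
# Transition table of M: (state, symbol) -> next state
DELTA = {
    ("q1", "0"): "q1",  # assumption, consistent with the listed strings
    ("q1", "1"): "q2",  # from the trace
    ("q2", "0"): "q3",  # from the trace
    ("q2", "1"): "q2",  # from the trace
    ("q3", "0"): "q2",  # assumption, consistent with the listed strings
    ("q3", "1"): "q2",  # from the trace
}

def accepts(w, start="q1", accept_states=("q2",)):
    """Run M on the string w and report whether it ends in an accept state."""
    state = start
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in accept_states

for w in ["1101", "1", "01", "100", "0", "10", "101000"]:
    print(w, "accept" if accepts(w) else "reject")
```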
4.1.2 Formal Definition of a Finite Automaton 

A formal definition is needed for two specific reasons: 

a. A formal definition is precise in nature and resolves any ambiguities involved. 
b. A formal definition provides a good notation to think and express clearly. 

According to the formal definition, a finite automaton is a list of five objects: set of states, 
input alphabet, rules for moving, start state and accept states. Thus a mathematical definition of 
a finite automaton is a five-tuple consisting of the five objects. 


4.2 Deterministic Finite Automata (DFA) 


Definition 
A deterministic finite automaton is a finite state machine where for each pair of state and 
input symbol there is a unique next state. 


4.2.1 Elements of DFA 


A deterministic finite automaton consists of the following five elements: 

a. a finite set of states Q, 
b. an alphabet Σ of possible input symbols, 
c. a transition function δ such that δ(x, i) = y where x, y ∈ Q and i ∈ Σ, 
d. the initial state q0 ∈ Q, 
e. the set of final states F, where F ⊆ Q. 

Note-1: 
The term deterministic refers to the fact that on each input, there 
is one and only one state to which the automaton can transit from 
its current state. 


4.2.2 Operation of DFA 


Initially, the DFA is assumed to be in the initial state q0 with its read head on the leftmost 
symbol of the input string. During each move of the DFA, the read head moves one position to 
the right. Thus, each move consumes one input symbol. When the end of the string is reached, 
the string is accepted if the DFA is in one of its final states, else it is rejected. The working of 
DFA is demonstrated in examples 4.2.1 and 4.2.2. 
EXAMPLE 4.2.1: (DFA that accepts the input string abba) 

Figure 4.4. P-V: Demonstration of Deterministic Finite Automata to accept the input 
string (panels: transition diagram, initial configuration, reading the input, output “accept”). 

EXAMPLE 4.2.2: (DFA that rejects input string aba) 

Figure 4.5. R-V: Demonstration of Deterministic Finite Automata to reject input string. 

4.2.3 Ordered Quintuple Specification of DFA: 


Formally, a DFA, M is a five-tuple 
M = (Q, Σ, δ, q0, F) 

where 
a. Q is a finite set of states of finite automata, 
b. Σ is a finite set of input symbols called the alphabet, 
c. δ: Q × Σ → Q is the transition function, 
d. q0 ∈ Q is the start state, 
e. F ⊆ Q is the set of accept states. 

If from a state p, there exists a transition going to state q on an input symbol a, then this 
is written as 
δ(p, a) = q 

where δ - is a function whose domain is a set of ordered pairs (p, a) 
p - a state 
a - input symbol. 
Thus, δ defines a mapping whose domain will be a set of ordered pairs of the form (p, a) 
and whose range will be a set of states, i.e., 

δ: Q × Σ → Q. 


4.2.4 Description of a DFA 


The transitions of a DFA can be represented using a transition diagram or a transition table.

Transition diagram: if δ(p, a) = q, then an arrow labelled a goes from the vertex that corresponds
to state p to the vertex that corresponds to state q.

Transition table: Rows correspond to states and columns correspond to inputs. Entries
correspond to next states to indicate the transitions of the DFA.


4.2.5 Extended transition function for DFA 
For a DFA M = (Q, Σ, δ, q0, F), the function δ is extended as
δ : Q × Σ* → Q
and is defined recursively as follows:
a. For any state q of Q,
δ(q, ε) = q.
This means that the DFA stays in the same state q when it reads an empty string at q.
b. For any state q of Q, any string x ∈ Σ* and any symbol a ∈ Σ (so that a is the last symbol of xa),


δ(q, xa) = δ(δ(q, x), a)
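The recursive definition can be transcribed almost literally into code. The sketch below is illustrative only: the name delta_hat, the dictionary encoding of δ and the parity machine PARITY are assumptions introduced here, not part of the text.

```python
def delta_hat(delta, q, w):
    """Extended transition function, following rules (a) and (b) above."""
    if w == "":                        # rule (a): the empty string leaves q unchanged
        return q
    x, a = w[:-1], w[-1]               # write w = xa, with a the last symbol
    return delta[(delta_hat(delta, q, x), a)]   # rule (b)


# Hypothetical DFA over {0, 1} that tracks the parity of the number of 1's.
PARITY = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}

print(delta_hat(PARITY, "even", "1011"))  # odd: the string has three 1's
```

The recursion peels symbols off the right end of the string, exactly as rule (b) does, so the innermost call is the one that applies rule (a).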


4.2.6 Language accepted by DFA 


The language accepted by a DFA M = (Q, Σ, δ, q0, F) is the set of all strings on Σ accepted
by M, i.e.,


L(M) = {W ∈ Σ* | δ(q0, W) ∈ F}


A string W is said to be rejected by the DFA M = (Q, Σ, δ, q0, F) if δ(q0, W) ∉ F. The set of
rejected strings is therefore the complement of L(M):


Σ* − L(M) = {W ∈ Σ* | δ(q0, W) ∉ F}


Note-2: A machine may accept several strings but it recognises only one 
language. 


4.3 Design of DFAs 


The basic design strategy for DFA is as follows: 
a. Understand the language properties for which the DFA has to be designed. 
b. Determine the state set required. 
c. Identify the initial, accepting and dead state of DFA. 
d. For each state, decide on the transition to be made for each character of the input 
string. 
e. Obtain the transition table and diagram for DFA.
f. Test the DFA obtained on short strings. 
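Step (f) can be automated by comparing a candidate DFA against a direct description of the language on every short string. The following is a sketch under stated assumptions: the helper names accepts and mismatches, the state names and the two candidate machines for "strings over {a, b} that start with a" are all hypothetical and chosen only for illustration.

```python
from itertools import product


def accepts(delta, start, finals, w):
    """Run the DFA on w and report whether it ends in a final state."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in finals


def mismatches(delta, start, finals, alphabet, spec, max_len):
    """Return every string of length <= max_len on which the DFA and spec disagree."""
    bad = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if accepts(delta, start, finals, w) != spec(w):
                bad.append(w)
    return bad


# Hypothetical design for "starts with a": s = start, ok = accepting, dead = trap.
GOOD = {
    ("s", "a"): "ok",      ("s", "b"): "dead",
    ("ok", "a"): "ok",     ("ok", "b"): "ok",
    ("dead", "a"): "dead", ("dead", "b"): "dead",
}
FLAWED = dict(GOOD)
FLAWED[("dead", "a")] = "ok"          # deliberate design error: the trap state leaks


def starts_with_a(w):
    return w.startswith("a")


print(mismatches(GOOD, "s", {"ok"}, "ab", starts_with_a, 3))    # []
print(mismatches(FLAWED, "s", {"ok"}, "ab", starts_with_a, 3))  # ['ba', 'baa', 'bab', 'bba']
```

An empty list does not prove the design correct, but a non-empty list pinpoints short counterexamples that usually reveal the faulty transition.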


EXAMPLE 4.3.1: To design a DFA that accepts the set of all strings of 0’s and 1’s that
end in 00.
Solution: We are required to design a DFA for the regular expression r = (0 + 1)*00.
In other words, the DFA should be designed to accept the language of r,

i.e., L(M) = {00, 100, 1100, 0000, 111000, ....}
where M is a DFA.
Consider Σ = {0, 1},


Q = {q0, q1, q2},
q0 = initial state,
q2 = final state and δ : Q × Σ → Q is given by


Transition diagram: 


Figure 4.6. State Transition to represent L(M) = {W ∈ Σ* | W ends with 00}


Transition table:


δ  | Present Inputs
Q  | 0    1
q0 | q1   q0
q1 | q2   q0
q2 | q2   q0


Table 4.2 State Transition Table
DFA action for the input string


1. To show that the string 100 is accepted by DFA:
i. Using sequence state diagram


Figure 4.7. Sequence State Diagram for 100


input:       1    0    0


state:  q0   q0   q1   q2


Since we encounter the end of the input and we are in the final state, we say that 
string is accepted by machine M. Thus 100 is in L(M). 
ii. Using extended transition function
Consider,
• δ(q0, 1) = q0
• δ(q0, 10) = δ(δ(q0, 1), 0)
= δ(q0, 0)
= q1
• δ(q0, 100) = δ(δ(q0, 10), 0)
= δ(q1, 0)
= q2
Since after scanning the entire string we reach the final state q2, the given
string 100 is accepted by the DFA M. Thus 100 is in L(M).
iii. Using vdash function (⊢)


• An input string W ∈ Σ* is accepted by M iff the moves (q0, W) ⊢M ... ⊢M (q, e) end in a state q ∈ F.
• Each move ⊢M takes a configuration (state, remaining input) to the next configuration.


• (q, e) means ‘reached end of input’.


Consider,


(q0, 100) ⊢M (q0, 00)
⊢M (q1, 0)
⊢M (q2, e)


Since we reached the end of input (e) and we are in the final state q2, 100
is accepted by M. Thus 100 is in L(M).
2. To show that the string 010 is rejected by DFA:
i. Using sequence state diagram


input:       0    1    0


state:  q0   q1   q0   q1


Figure 4.8. Sequence State Diagram for 010


Since we encounter the end of input and q1 is not the final state, we say that the
string 010 is rejected by the machine.
ii. Using extended transition function
• δ(q0, 0) = q1
• δ(q0, 01) = δ(δ(q0, 0), 1)
= δ(q1, 1)
= q0
• δ(q0, 010) = δ(δ(q0, 01), 0)
= δ(q0, 0)
= q1
Since after scanning the entire string we did not reach the final state q2,
the string 010 is rejected by the DFA M.


iii. Using vdash function
(q0, 010) ⊢M (q1, 10)
⊢M (q0, 0)
⊢M (q1, e).
Since we reached the end of input (e) and we are not in the final state q2, 010
is rejected by M. Thus 010 is not in L(M).
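The sequence of moves used in this example can be generated mechanically. The sketch below encodes Table 4.2 as a dictionary; the names DELTA, configurations and show, and the ASCII rendering "|-" of the ⊢ symbol, are assumptions made for this illustration.

```python
# Transition table of Example 4.3.1 (strings ending in 00).
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q2", ("q1", "1"): "q0",
    ("q2", "0"): "q2", ("q2", "1"): "q0",
}
START, FINALS = "q0", {"q2"}


def configurations(w):
    """List the configurations (state, unread input) visited while reading w."""
    state = START
    steps = [(state, w or "e")]            # "e" marks the end of the input
    for i, symbol in enumerate(w):
        state = DELTA[(state, symbol)]
        steps.append((state, w[i + 1:] or "e"))
    return steps


def show(w):
    """Render the moves in the style (q0, 100) |- (q0, 00) |- ..."""
    steps = configurations(w)
    verdict = "accepted" if steps[-1][0] in FINALS else "rejected"
    return " |- ".join(f"({q}, {rest})" for q, rest in steps) + f"  => {verdict}"


print(show("100"))
print(show("010"))
```

Running it on 100 and 010 reproduces the two derivations worked out above, ending in (q2, e) and (q1, e) respectively.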


EXAMPLE 4.3.2: To design a DFA that accepts a set of even number of a’s. 


Solution: We are required to design a DFA M for the regular expression r = (aa)*. In
other words, the DFA should be designed to accept the language of r,


i.e., L(M) = {aa, aaaa, aaaaaa, ....}
Consider Σ = {a}, Q = {q0, q1}, q0 = initial state, q0 = final state and δ is given by:
Transition diagram:


Figure 4.9. State Transition to represent L(M) = {W ∈ Σ* | the number of a’s in W is even}


Transition Table:


δ  | Present Input
Q  | a
q0 | q1
q1 | q0


Table 4.3 State Transition Table


DFA action for the input string 


i. To show that the string aaaa is accepted by DFA:
Consider,
(q0, aaaa) ⊢M (q1, aaa)
⊢M (q0, aa)
⊢M (q1, a)
⊢M (q0, e)
Since we reached the end of input (e) and we are in the final state q0, aaaa is
accepted by M. Thus aaaa is in L(M).

ii. To show that the string aaaaa is rejected by DFA:
Consider,
(q0, aaaaa) ⊢M (q1, aaaa)
⊢M (q0, aaa)
⊢M (q1, aa)
⊢M (q0, a)
⊢M (q1, e).


Since we reached the end of input and we are not in the final state q0, aaaaa
is not accepted by M.
EXAMPLE 4.3.3: To design a DFA that accepts an odd number of 1’s.


Solution: We are required to design a DFA M for the regular expression r = (11)*1 or
r = 1(11)*. In other words, the DFA should be designed to accept the language of r,


i.e., L(M) = {1, 111, 11111, .....}.


Consider Σ = {1}, Q = {q0, q1, q2}, q0 = initial state, q1 = final state and δ is
given by:
Transition diagram:


Figure 4.10. Transition Diagram for L(M) = {W ∈ Σ* | W has an odd number of 1’s}


Transition Table:


Table 4.4 State Transition Table


DFA action for the input string 
i. To show that 11111 is accepted by DFA:


Figure 4.11. State Sequence Diagram for 11111


Since we encounter the end of input and we are in the final state, we say that the string
is accepted by M. Thus, 11111 is in L(M).
ii. To show that the string 1111 is rejected by DFA:


Figure 4.12. State Sequence Diagram for 1111


Since we encounter the end of input and we are not in the final state, we say that the string
is rejected by M. Thus 1111 is not in L(M).


EXAMPLE 4.3.4: To design a DFA that accepts strings that


• start with 0 and have an odd number of 0’s
• start with 1 and have an even number of 1’s.


Solution: We are required to design a DFA M for the regular expression
r = (0 + 1)·((0 + 1)·(0 + 1))*,
i.e., L(M) = {01100, 110011, ......}.
Consider Σ = {0, 1}, Q = {q0, q1, q2}, q0 = initial state, q1 = final state and
δ is given by:


Transition diagram:


Figure 4.13. Transition Diagram for L(M) = {W | W starts with 0 and has odd 0’s; W starts with 1 and has even 1’s}


Transition Table:


Table 4.5 State Transition Table


DFA action for the input string 
i. To show that 110011 is accepted by DFA:
Consider, (q0, 110011) ⊢M (q2, 10011)
⊢M (q1, 0011)
⊢M (q2, 011)
⊢M (q1, 11)
⊢M (q2, 1)
⊢M (q1, e)


Since we reached the end of input (e) and we are in the final state q1, 110011
is accepted by M.


ii. To show that the string 011000 is rejected by DFA:


Consider, (q0, 011000) ⊢M (q1, 11000)
⊢M (q2, 1000)
⊢M (q1, 000)
⊢M (q2, 00)
⊢M (q1, 0)
⊢M (q2, e)


Since we reached the end of input (e) and are not in the final state q1, 011000 is
rejected by M.


EXAMPLE 4.3.5: To design a DFA to accept even number of a’s and b’s. 
Solution: This is to design a DFA M for the regular expression r = (aa)*(bb)*,
i.e., L(M) = {aa, bb, aabb, aaaabbbb, aaaabb, ....}


Consider Σ = {a, b}, Q = {q0, q1, q2, q3}, q0 = initial and final state and δ is given
by:


Transition diagram:


Figure 4.14. Transition Diagram for L(M) = {W ∈ Σ* | W has even length of a’s and b’s}
Transition Table:
Table 4.6 State Transition Table
DFA action for the input string
i. To show that aabb is accepted by DFA:


Figure 4.15. State Sequence Diagram for aabb


Since we encounter the end of input and we are in the final state, we say that the
string is accepted by M.


ii. To show that aabba is rejected by DFA:


Figure 4.16. State Sequence Diagram for aabba


Since we encounter the end of input and we are not in the final state, we say that the
string is not accepted by M; hence aabba is not in L(M).
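One common way to obtain a four-state machine of this kind is to let each state record two parities. The sketch below is an assumption about the intended construction, since the transition table of this example is not reproduced here: a state is a pair (a_odd, b_odd), the start state (False, False) plays the role of q0, and the function names are hypothetical.

```python
def step(state, symbol):
    """One move of the parity automaton: flip the parity of the symbol just read."""
    a_odd, b_odd = state
    if symbol == "a":
        return (not a_odd, b_odd)
    if symbol == "b":
        return (a_odd, not b_odd)
    raise ValueError(f"symbol {symbol!r} is not in the alphabet {{a, b}}")


def accepts_even_even(w):
    """Accept iff w has an even number of a's and an even number of b's."""
    state = (False, False)             # initial state, also the only final state
    for symbol in w:
        state = step(state, symbol)
    return state == (False, False)


print(accepts_even_even("aabb"))   # True
print(accepts_even_even("aabba"))  # False: three a's
```

Under this reading the four states correspond to the four combinations of parities, which matches the state count Q = {q0, q1, q2, q3} and the single state that is both initial and final.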


EXAMPLE 4.3.6: To design a DFA to accept an odd number of a’s and an odd number of b’s.


Solution: This is to design a DFA M for the regular expression r = (aa)*a.(bb)*b,
i.e., L(M) = {ab, aaabbb, abbbbb, ababab, aabbab, ....}.
Consider Σ = {a, b}, Q = {q0, q1, q2, q3}, q0 = initial state, q3 = final state and δ
is given by:


Transition Table:


Table 4.7 State Transition Table
Transition diagram:


Figure 4.17. Transition Diagram for L(M) = {W ∈ Σ* | W has odd length of a’s and b’s}
DFA action for the input string
i. To show that ababab is accepted by DFA:
Consider, (q0, ababab) ⊢M (q1, babab)
⊢M (q3, abab)
⊢M (q2, bab)
⊢M (q0, ab)
⊢M (q1, b)
⊢M (q3, e)

Since we reached the end of input (e) and we are in the final state, ababab is accepted
by M.
ii. To show that bbabb is rejected by DFA:


Consider, (q0, bbabb) ⊢M (q2, babb)
⊢M (q0, abb)
⊢M (q1, bb)
⊢M (q3, b)
⊢M (q1, e)
Since we reached the end of input (e) and are not in the final state, bbabb is rejected by
the DFA.
EXAMPLE 4.3.7: To design a DFA to accept odd number of a’s and even number of b’s


Solution: This is to design a DFA M for the regular expression r = (aa)*a.(bb)*,

i.e., L(M) = {aaa, aaabb, a, abb, abbbb, aaabbbb, ...}
Consider Σ = {a, b}, Q = {q0, q1, q2, q3}, q0 = initial state, q1 = final state and δ is given
by:


Transition diagram:


Figure 4.18. Transition Diagram for L(M) = {W ∈ Σ* | W has odd no of a’s and even no of b’s}


Transition Table:


Table 4.8 State Transition Table
DFA action for the input string
i. To show that ababa is accepted by DFA:


Figure 4.19. State Sequence Diagram for ababa


Since we encounter the end of input and are in the final state, we say that the string
is accepted by M.


ii. To show that ababaa is rejected by DFA:


Figure 4.20. State Sequence Diagram for ababaa


Since we encounter the end of input and are not in the final state, we say that the string
is rejected by M.


EXAMPLE 4.3.8: To design a DFA that accepts the set of all strings ending with 3 consecutive
zeros, over the alphabet {0, 1}.


Transition diagram:


Figure 4.21. Transition Diagram for L(M) = {W ∈ Σ* | W is a string ending with 3
consecutive zeros}
Solution: This is to design a DFA M for the regular expression r = (0 + 1)*000 or
r = 1*(01)*(001)*(0001)*.000,
i.e., L(M) = {000, 1000, 11000, ....}


Consider Σ = {0, 1}, Q = {q0, q1, q2, q3}, q0 = initial state, q3 = final state and δ
is shown in figure 4.21 and Table 4.9.


Transition Table:


Table 4.9 State Transition Table


DFA action for the input string 
To show that 01011000 is accepted by DFA:


Consider, (q0, 01011000) ⊢M (q1, 1011000)
⊢M (q0, 011000)
⊢M (q1, 11000)
⊢M (q0, 1000)
⊢M (q0, 000)
⊢M (q1, 00)
⊢M (q2, 0)
⊢M (q3, e).


Since we reached the end of input (e) and are in the final state q3, 01011000 is accepted
by M.


EXAMPLE 4.3.9: To design a DFA to accept the language L = {aWa | W ∈ (a + b)*} over
the alphabet Σ = {a, b}.
Solution: This is to design a DFA M for the regular expression r = a(a + b)*a or
r = ab*(ab)*a*, i.e., a machine M to accept all strings starting with ‘a’, followed by any
number of a’s and b’s and ending with a,


i.e., L(M) = {aa, aba, aaaa, abbbaaa, ....}.


Consider Σ = {a, b}, Q = {q0, q1, q2, q3}, q0 = initial state, q2 = final state and
δ is given by:


Transition diagram:


Figure 4.22. Transition Diagram for L(M) = {W ∈ Σ* | W ∈ a(a + b)*a} (the diagram includes a dead/trap state with a loop on a, b)


Transition Table:
Table 4.10 State Transition Table
DFA action for the input string
i. To show that ababa is accepted by DFA:


Consider (q0, ababa) ⊢M (q1, baba)
⊢M (q1, aba)
⊢M (q2, ba)
⊢M (q1, a)
⊢M (q2, e).
Since we reached the end of input (e) and are in the final state q2, ababa is
accepted.


ii. To show that ababab is rejected by DFA:


Consider (q0, ababab) ⊢M (q1, babab)
⊢M (q1, abab)
⊢M (q2, bab)
⊢M (q1, ab)
⊢M (q2, b)
⊢M (q1, e).


Since we reached the end of input and are not in the final state, ababab is
rejected by M.


EXAMPLE 4.3.10: To design a DFA to accept the set of all strings of a and b starting with 
the string ab. 


Solution: This is to design a DFA M for the regular expression r = ab(a + b)*,
i.e., L(M) = {ab, abaa, abaaa, abbbb, ....}.


Consider Σ = {a, b}, Q = {q0, q1, q2, q3}, q0 = initial state, q2 = final state and
δ is given as follows:
Transition diagram:


Figure 4.23. Transition Diagram for L(M) = {W ∈ Σ* | W starts with ab} (the diagram includes a dead/trap state with a loop on a, b)


Transition Table:


Table 4.11 State Transition Table


DFA action for the input string 
i. To show that the string ababa is accepted by DFA:


Figure 4.24. State Sequence Diagram for ababa
Since we encounter the end of input and we are in the final state, we say that the
string ababa is accepted by DFA.


ii. To show that the string aaab is rejected by DFA:


Figure 4.25. State Sequence Diagram for aaab


Since we encounter the end of input and q3 is not the final state, we say that the string
aaab is rejected by DFA.


EXAMPLE 4.3.11: To design a DFA that accepts strings of zeros and ones with an equal
number of zeros and ones, such that no prefix of the string contains two more zeros than ones or
two more ones than zeros.


Solution: This is to design a DFA M for L(M) = {010101, 011001, ....}.
Consider Σ = {0, 1}, Q = {A, B, C}, q0 = A, final state F = {A} and δ is given by:


Transition diagram:


Figure 4.26. Transition diagram for L(M) = {W ∈ Σ* | W contains equal no’s of 0’s and
1’s}
Transition Table:


δ | Present Inputs
Q | 0    1
A | B    C
B | -    A
C | A    -


Table 4.12 State Transition Table


DFA action for the input string 
i. To show that 011001 is accepted by DFA:


Consider (A, 011001) ⊢M (B, 11001)
⊢M (A, 1001)
⊢M (C, 001)
⊢M (A, 01)
⊢M (B, 1)
⊢M (A, e).
Since we reached the end of input (e) and are in the final state, 011001 is accepted.
ii. To show that 0100 is rejected by DFA:
Consider (A, 0100) ⊢M (B, 100)
⊢M (A, 00)
⊢M (B, 0)
⊢M (no transition).


Since there is no transition from the configuration (B, 0) and we are not in the final state, 0100 is
rejected by the DFA.


EXAMPLE 4.3.12: To design a DFA over the alphabet Σ = {0, 1} that accepts the set of strings
of 0’s and 1’s except those containing the substring 110.
Solution: We need to design a DFA M for the regular expression r = 0* + 0*(10)* +
0*(10)*11*,

i.e., L(M) = {00, 11, 0101, 00010010011, ......}.
Consider Σ = {0, 1}, Q = {q0, q1, q2, q3}, q0 = initial state, F = {q0, q1, q2} as final
states and δ is given by:


Transition Diagram:


Figure 4.27. Transition Diagram for L(M) = {W ∈ Σ* | W doesn’t contain the substring 110} (the diagram includes a dead/trap state)


Transition Table:


Table 4.13 State Transition Table
DFA action for the input string 
To show that 0101101 is rejected by DFA:
Consider, (q0, 0101101) ⊢M (q0, 101101)
⊢M (q1, 01101)
⊢M (q0, 1101)
⊢M (q1, 101)
⊢M (q2, 01)
⊢M (q3, 1)
⊢M (q3, e).
Since we reached the end of input (e) and are not in any of the final states (q0, q1, q2),
0101101 is rejected by M.
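A design of this kind can be checked against the defining property on all short strings. In the sketch below, most transitions are read off the derivation above; the entries δ(q2, 1) = q2 and δ(q3, 0) = q3 do not occur in that derivation and are assumptions consistent with q3 being the dead/trap state. The names DELTA, accepts and disagreements are likewise hypothetical.

```python
from itertools import product

# q0: no useful suffix, q1: last symbol 1, q2: last two symbols 11, q3: trap (110 seen).
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q3", ("q2", "1"): "q2",   # (q2, 1) is an assumed entry
    ("q3", "0"): "q3", ("q3", "1"): "q3",   # (q3, 0) is an assumed entry
}
FINALS = {"q0", "q1", "q2"}


def accepts(w):
    state = "q0"
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in FINALS


def disagreements(max_len):
    """Strings up to max_len where the DFA differs from 'does not contain 110'."""
    bad = []
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            if accepts(w) != ("110" not in w):
                bad.append(w)
    return bad


print(accepts("0101101"))   # False, as in the derivation above
print(disagreements(10))    # []
```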


EXAMPLE 4.3.13: To design a DFA over the alphabet Σ = {0, 1} that accepts the set of strings
of 0’s and 1’s except those containing the substring 001.


Solution: This is to design a DFA M for the language such that
L(M) = {00, 11, 0101, 1000000, .....}.


Consider Σ = {0, 1}, Q = {q0, q1, q2, q3}, q0 = initial state, F = {q0, q1, q2} as
final states and δ is given by:


Transition Diagram:


Figure 4.28. Transition Diagram for L(M) = {W ∈ Σ* | W doesn’t contain the substring
001} (the diagram includes a dead/trap state with a loop on 0, 1).
Transition Table:

Table 4.14 State Transition Table
DFA action for the input string
To show that 100100 is rejected by DFA:


Consider (q0, 100100) ⊢M (q0, 00100)
⊢M (q1, 0100)
⊢M (q2, 100)
⊢M (q3, 00)
⊢M (q3, 0)
⊢M (q3, e).


Since we reached the end of input (e) and we are not in a final state, the string
100100 is rejected by the DFA M.


EXAMPLE 4.3.14: To design a DFA over the alphabet Σ = {0, 1} that accepts the set of strings
of 0’s and 1’s which contain the substring 0101.


Solution: We are to design a DFA M for the regular expression r = (0 + 1)*0101(0 + 1)*,
i.e., L(M) = {00101, 10101, 11010100, .....}.
Consider Σ = {0, 1}, Q = {q0, q1, q2, q3, q4}, initial state = q0, final state = {q4}
and δ is given by:


Transition Diagram:


Figure 4.29. Transition Diagram for L(M) = {W ∈ Σ* | W is (0 + 1)*0101(0 + 1)*}
Transition Table:


Table 4.15 State Transition Table


DFA action for the input string 
To show that string 010010101 is accepted by DFA:


Consider, (q0, 010010101) ⊢M (q1, 10010101)
⊢M (q2, 0010101)
⊢M (q3, 010101)
⊢M (q3, 10101)
⊢M (q4, 0101)
⊢M (q4, 101)
⊢M (q4, 01)
⊢M (q4, 1)
⊢M (q4, e)

Since we reached the end of input (e) and are in the final state, the string 010010101
is accepted by the DFA M.

EXAMPLE 4.3.15: Construct a DFA that models the ATM


Solution: Consider a customer who inserts his bank card into the ATM. It requests him to
input his identification number (ID). Assume that the ID is 234 (made up of 3 digits).
The DFA to model this system needs the following states:
q0: Initial state, waiting for the first digit in the ID.

q1: The first digit is correct; now waiting for the second digit.
q2: The second digit is correct; now waiting for the third digit.
q3: The third digit is correct; stop the process, the ID is valid.
q4: Trap state that captures all invalid IDs.


Thus, the DFA has Σ = {0, 1, ..., 9}, Q = {q0, q1, q2, q3, q4}, initial state q0 and F = {q3}.


The transition diagram of the DFA is shown in figure 4.30.


Figure 4.30. DFA that models the ATM. 
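The ATM model can be written out as a transition table built in a loop. This is a sketch under assumptions: the names ID, DIGITS, build_delta and valid_id are hypothetical, and sending q3 to the trap state on any further digit is an assumption (the text only says that the process stops at q3).

```python
ID = "234"                 # the identification number assumed in the example
DIGITS = "0123456789"      # the alphabet of the DFA


def build_delta():
    """Build the transition table for states q0..q4 described above."""
    delta = {}
    for i, expected in enumerate(ID):          # q0, q1, q2 each wait for one digit
        for d in DIGITS:
            delta[(f"q{i}", d)] = f"q{i + 1}" if d == expected else "q4"
    for d in DIGITS:
        delta[("q3", d)] = "q4"                # assumption: extra digits invalidate the ID
        delta[("q4", d)] = "q4"                # the trap state is never left
    return delta


DELTA = build_delta()


def valid_id(digits):
    state = "q0"
    for d in digits:
        state = DELTA[(state, d)]
    return state == "q3"                       # F = {q3}


print(valid_id("234"))   # True
print(valid_id("244"))   # False: wrong second digit
```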


EXAMPLE 4.3.16: Let L = {a^i b^j | i ≥ 0, j ≥ 0} be a language over Σ = {a, b}. Show that
there exists a DFA D = (Q, Σ, q0, δ, F) such that L(D) = L.


Solution: Let Q = {q0, q1, q2}, Σ = {a, b} and F = {q0, q1}. The transitions of D are
described in the transition diagram shown in figure 4.31.


Figure 4.31. DFA to accept L = {a^i b^j | i ≥ 0, j ≥ 0}


It is clear from the transition diagram that L(D) = {a^i b^j | i ≥ 0, j ≥ 0}.


EXAMPLE 4.3.17: Construct a DFA to accept the set of all strings of the form 07”11, 
n> 0. 
Solution: Let M = (Q, Σ, q0, δ, F) be the DFA such that Q = {q0, q1, q2, q3, q4, q5, q6}, Σ =
{0, 1}, q0 = q0 and F = {q4}. The transition δ is shown in figure 4.32.


Figure 4.32. DFA to accept {O7"11,n > 0.} 


EXAMPLE 4.3.18: Construct an automaton to accept only the word ε over the alphabet
Σ = {a, b}, given Q = {q0, q1}.


Solution: Let M = (Q, Σ, δ, q0, F) be the DFA such that Q = {q0, q1}, Σ = {a, b}, q0 =
q0 and F = {q0}. The transition δ, for the DFA to accept only the word ε, is shown in
figure 4.33.


Figure 4.33. Transition Diagram for L(M) = {ε}, r = ε (q1 is a dead/trap state with a loop on a, b)


EXAMPLE 4.3.19: Construct an automaton to accept a string, starting with a followed by 
any number of b’s and ending with a, or a string starting with b followed by any number of 
a’s and ending with b, over the alphabet © = {a, b}. 
Solution: Let M = (Q, Σ, δ, q0, F) be the DFA such that Q = {1, 2, 3, 4}, Σ = {a, b}, q0 =
1 and F = {4}. The transition δ for the DFA is shown in figure 4.34.


Figure 4.34. Transition Diagram for
L(M) = {W ∈ Σ* | W starts and ends with a, or W starts and ends with b},
r = (a + b)(a + b)*(a + b).


EXAMPLE 4.3.20: Construct a finite automaton to accept those strings of binary numbers
that are divisible by three, over the alphabet Σ = {0, 1}.


Solution: Let M = (Q, Σ, δ, q0, F) be the DFA such that Q = {q0, q1, q2}, Σ = {0, 1}, q0 =
q0 and F = {q0}. The transition δ, to accept those strings of binary numbers that are
divisible by 3, is shown in figure 4.35.
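A standard way to build such a machine is to let the state remember the remainder modulo 3 of the number read so far: appending a bit b to a number n gives 2n + b, so the remainder r becomes (2r + b) mod 3. The sketch below assumes that state q_r stands for remainder r, which agrees with q0 being both initial and final; the function names are hypothetical.

```python
def delta(r, bit):
    """Transition on remainders: reading bit b turns the value n into 2n + b."""
    return (2 * r + int(bit)) % 3


def divisible_by_three(bits):
    r = 0                              # q0: remainder 0 (initial and final state)
    for bit in bits:
        r = delta(r, bit)
    return r == 0


for w in ["11", "110", "1001", "10"]:
    print(w, divisible_by_three(w))
# 11 True, 110 True, 1001 True, 10 False
```

The first three strings are the members of L(M) listed with figure 4.35 (the numbers 3, 6 and 9); the string 10 (the number 2) ends in the state for remainder 2 and is rejected.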


EXAMPLE 4.3.21: Construct a DFA to accept strings of 0’s, 1’s and 2’s, beginning with a 
‘0’ followed by odd number of 1’s and ending with a ‘2’. 


Solution: Let M = (Q, Σ, δ, q0, F) be the DFA such that Q = {q0, q1, q2, q3, q4}, Σ =
{0, 1, 2}, q0 = q0 and F = {q3}. The transition δ for the DFA is shown in figure 4.36.
Figure 4.35. Transition Diagram for
L(M) = {11, 110, 1001, ....} or L(M) = {W ∈ Σ* | W is divisible by 3}


Figure 4.36. Transition Diagram for L(M) = {W | W ∈ 0(11)*12}


4.4 Nondeterministic Finite Automata (NFA) 


In this section, the concept of nondeterminism is discussed, which has had a great impact
on the theory of computation. In the foregoing sections, it was observed that every step
of computation proceeds in a unique way from the preceding step. In other words,
when a machine (in a given state) reads the next input symbol, the next state is
uniquely determined. This is called deterministic computation. However, in the case
of a nondeterministic machine, multiple choices may exist for the next state at any point.
Since a single next state is a special case of several possible next states, every deterministic
finite automaton (DFA) is automatically a nondeterministic finite automaton (NFA).


4.4.1 Illustration 


Consider a finite automaton (DFA) that accepts only the string 01,
represented in the transition diagram shown below:


Figure 4.37. DFA that accepts the string 01


Table 4.16 Transition Table


δ  | Present Inputs
Q  | 0     1
q0 | q1    ∅
q1 | ∅     q2
q2 | ∅     ∅


With respect to this DFA, the following points are noted:


a. On input 0 in state q0, the next state is q1 and there is no next state on input 1.
b. There is no next state on input 0 in state q1.
c. There is no next state on input 0 and 1, in state q2.


If the following inputs are supplied,


a. input = 01:  start → q0 -0→ q1 -1→ q2


i.e. there exists a unique path to reach from q0 to q2 for input 01.


b. input = 10:  start → q0 -1→ (stuck or dies)


There is no path to reach from q0 to q2 for the input 10.
Here, the DFA has at most one target state for a given state and input, and there exists a unique
path from the initial to the final state.


Now, let us modify the above DFA so that it accepts all strings ending in 01. The transition
diagram for this automaton is given in figure 4.38.


Figure 4.38. FA that accepts strings ending in 01


Table 4.17 Transition Table
δ  | Present Inputs
Q  | 0           1
q0 | {q0, q1}    {q0}
q1 | ∅           {q2}
q2 | ∅           ∅


With respect to this modified FA, the following points are noted:


a. On input 0 in state q0, the next state may be either of the two states viz., q0 or q1.
b. There is no next state on input 0 in state q1.
c. There is no next state on input 0 and 1, in the state q2.


The processing of input string 00101 by this FA is shown in figure 4.39.


Figure 4.39. Processing of FA for the input string 00101.
Here, for the first input symbol 0 of the string, there exist two choices from the initial state q0:
to stay at the same state q0 or to move to the next state q1.
This means that an input symbol need not result in a transition to a unique state; it may
result in several alternative chains of states. Thus, the above modified FA has multiple paths corresponding
to the input 00101. The string 00101 is accepted by this machine if
there is at least one path that consumes the whole input, starting from the initial state and ending at the final state.


Path-1:
q0 -0→ q0 -0→ q0 -1→ q0 -0→ q0 -1→ q0
=> remains at q0 itself, which is not a final state.
Path-2:
q0 -0→ q1
=> stuck at q1 (no transition on the next symbol 0).
Path-3:
q0 -0→ q0 -0→ q1 -1→ q2
=> stuck at q2 (no transition on the next symbol 0).
Path-4:
q0 -0→ q0 -0→ q0 -1→ q0 -0→ q1 -1→ q2
=> starts at the initial state and ends at the final state. Thus, the string is accepted by the
machine in this path.


This machine has thus allowed several states as a result of processing a single input symbol;
this is called nondeterminism. If from any state we can reach several states or a null state,
then the finite automaton becomes nondeterministic in nature.
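The four paths can be enumerated by following every choice the machine has. The sketch below encodes Table 4.17 with missing entries standing for "no next state"; the names NFA, FINALS and paths are assumptions made for this illustration.

```python
# Table 4.17: the FA of figure 4.38; a missing entry means "no next state".
NFA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
FINALS = {"q2"}


def paths(state, w):
    """Yield (visited states, stuck) for every way the machine can try to read w."""
    if not w:
        yield [state], False               # whole input consumed
        return
    choices = NFA.get((state, w[0]), set())
    if not choices:
        yield [state], True                # no move possible: this branch dies
        return
    for nxt in sorted(choices):
        for tail, stuck in paths(nxt, w[1:]):
            yield [state] + tail, stuck


for visited, stuck in paths("q0", "00101"):
    if stuck:
        verdict = "stuck"
    elif visited[-1] in FINALS:
        verdict = "accepted"
    else:
        verdict = "ends in a non-final state"
    print(" -> ".join(visited), ":", verdict)
```

The output lists exactly four paths, one of which (q0, q0, q0, q0, q1, q2) consumes the whole input and ends in the final state.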
Definition


If the basic finite automaton model is modified in such a way that, from a state on an
input symbol, one or more transitions/choices are permitted, then the corresponding finite
automaton is called a “non-deterministic finite automaton” (NFA).


4.4.2 Elements of NFA 


A non-deterministic finite automaton exhibits the following five characteristics: 


a. A finite set of states (Q).
b. An alphabet Σ of possible input symbols.


c. The transition function (δ), which specifies the nature of transition at a given state
due to the input symbols.


d. The initial state q0.
e. The set of final states (F).


Note-3: 

The term nondeterministic refers to the fact that for each input, 
there can be several states to which the automaton can make a 
transition from its current state. 


4.4.3 How NFA Operates 


NFA corresponds to a kind of parallelism in automata. It consists of the same basic model 
or components as DFA i.e., input tape, read head (single) and finite control (internal state). 
However, when transition function allows more than one next state, for a given state and 
input, it is required to keep an independent internal state for each of the alternatives. In other 
words, there must be a constantly growing and shrinking set of automata, processing the 
same input synchronously. 

To be very precise, nondeterminism may be viewed as a type of parallel computation
in which several processes are running concurrently. If at least one of these processes
accepts, then the entire computation accepts. The operation
of a nondeterministic finite automaton is demonstrated in figure 4.40.


Figure 4.40. P-Y: Demonstration of Nondeterministic Finite Automata (Panels: nondeterministic
finite acceptor with alphabet {a}; two choices; first choice; no transition: the automaton hangs.)
4.4.4 Ordered Quintuple Specification


Formally, a nondeterministic finite automaton M is a five-tuple:


M = (Q, Σ, δ, q0, F)
where,


Q is a finite set of states,
Σ is a set of input symbols,

δ is a transition function, i.e., δ : Q × Σ → P(Q), where P(Q) denotes the power set
of Q,

q0 is a specifically designated initial state,

F ⊆ Q is a set of final states.


In an NFA, the transition function δ takes a state and an input symbol and
produces the set of possible next states.


4.4.5 Description of an NFA 
Transitions of an NFA can be represented using a transition table or diagram.


Transition table of NFA has rows corresponding to states and columns corresponding to
the inputs. The entries are sets of next states, written within braces { }, and
indicate the transitions of the NFA.


Transition diagram of NFA says that, if δ(q, a) = {q1, ..., qk}, then there will be an arc
labelled ‘a’ from q to each of q1, q2, ..., qk. This means that if there is a path labelled ‘b’
leading from some q0 to q, then there are paths labelled ‘ba’ from q0 to each of q1, ..., qk.
Suppose δ(q, a) = ∅; then there is no arc labelled ‘a’ from state q in the transition diagram and
no path labelled ‘ba’ that visits q in its next-to-last state. In the mathematical sense, this is
a shorthand notation.


EXAMPLE 4.4.1: (Describing NFA)
Initial state = 0, Final states = {0, 2}, Q = {0, 1, 2, 3}, Σ = {a, b} and the transition δ of the NFA
is given by:
Transition table:
δ | Present Inputs
Q | a        b
0 | {1,3}    ∅
1 | {2,3}    {0,1,2,3}
2 | {3}      ∅
3 | {3}      {0,2}


Table 4.18 Transition table of NFA. 


Transition Diagram: 


Figure 4.41. Transition Diagram of NFA. 


4.4.6 Extended transition function for NFA 


In the case of an NFA, there may exist more than one path corresponding to a given input x
in Σ*; hence it is required to test the multiple paths corresponding to x in order to decide
whether x is accepted by the NFA or not. This is because, for the NFA to accept x, at least one path
corresponding to x must start in the initial state and end in one of the
final states. In the case of a DFA, on the other hand, there exists exactly one path corresponding to x
in Σ*, so it is enough to test whether or not that path starts in the initial state and ends in one
of the final states, in order to decide whether x is accepted by the DFA or not.




Downloaded from https://www.cambridge.org/core. Stockholm University Library, on 06 Dec 2018 at 07:58:37, subject to the Cambridge Core terms of use, available at 
https://www.cambridge.org/core/terms. https://doi.org/10.1017/UPO9788175968363.005 


A Textbook on Automata Theory 
Definition of δ: For a state q and a string x, δ(q, x) is the set of states that the NFA can reach
when it reads the string x, starting at the state q. In general, the NFA nondeterministically goes
through a number of states from the state q as it reads the symbols in the string x. Thus for
an NFA M = (Q, Σ, δ, q0, F), the function δ is extended as
δ : Q × Σ* → P(Q)


and is defined recursively as follows:


a. For any state q of Q,


δ(q, ε) = {q}


This means that an NFA stays in the same state q when it reads an empty string at
state q.


b. For any state q of Q, any string x ∈ Σ* and any symbol a ∈ Σ (so that a is the last
symbol of xa),


δ(q, xa) = ⋃ δ(p, a), the union being taken over every p in δ(q, x)


This means that the set of states that can be reached by the NFA after reading the string xa starting
at state q is the set of states it can reach by reading the symbol a after reading the string x,
starting at state q.


EXAMPLE 4.4.2: (Language acceptance of strings by NFA) 
Consider the automata of figure 4.38 with input w = 00101. 




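Table 4.19 Output table

\begin{align*}
\hat{\delta}(q_0, \epsilon) &= \{q_0\} \\
\hat{\delta}(q_0, 0) &= \{q_0, q_1\} \\
\hat{\delta}(q_0, 00) &= \delta(\hat{\delta}(q_0, 0), 0) = \delta(\{q_0, q_1\}, 0) \\
&= \delta(q_0, 0) \cup \delta(q_1, 0) = \{q_0, q_1\} \cup \emptyset = \{q_0, q_1\} \\
\hat{\delta}(q_0, 001) &= \delta(\hat{\delta}(q_0, 00), 1) = \delta(\{q_0, q_1\}, 1) \\
&= \delta(q_0, 1) \cup \delta(q_1, 1) = \{q_0\} \cup \{q_2\} = \{q_0, q_2\} \\
\hat{\delta}(q_0, 0010) &= \delta(\hat{\delta}(q_0, 001), 0) = \delta(\{q_0, q_2\}, 0) \\
&= \delta(q_0, 0) \cup \delta(q_2, 0) = \{q_0, q_1\} \cup \emptyset = \{q_0, q_1\} \\
\hat{\delta}(q_0, 00101) &= \delta(\hat{\delta}(q_0, 0010), 1) = \delta(\{q_0, q_1\}, 1) \\
&= \delta(q_0, 1) \cup \delta(q_1, 1) = \{q_0\} \cup \{q_2\} = \{q_0, q_2\}
\end{align*}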


4.4.7 Language accepted by NFA 
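The language accepted by an NFA $M$ is the set of all strings $x$ for which $\hat{\delta}(q_0, x)$
contains at least one final state:

$$L(M) = \{x \mid \hat{\delta}(q_0, x) = P, \text{ where } P \text{ contains at least one member of } F\}$$

or, equivalently,

$$L(M) = \{x \mid \hat{\delta}(q_0, x) \cap F \neq \emptyset\}.$$

This means that the NFA accepts the string $x$ iff it can reach an accepting state by reading
$x$ starting at the initial state.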
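
The acceptance condition can be checked mechanically: compute the set of reachable states and intersect it with $F$. The sketch below is illustrative only; the function name `accepts` and the small NFA (`DELTA`, `FINALS`) are assumptions for this example, not part of the text.

```python
from typing import Dict, Set, Tuple

Delta = Dict[Tuple[str, str], Set[str]]


def accepts(delta: Delta, start: str, finals: Set[str], x: str) -> bool:
    """Return True iff delta_hat(start, x) shares a state with finals."""
    current = {start}
    for a in x:
        # union of delta(p, a) over all currently reachable states p
        current = {r for p in current for r in delta.get((p, a), set())}
    return bool(current & finals)       # non-empty intersection means accept


# Hypothetical NFA (assumed transitions), with q2 as the only final state.
DELTA: Delta = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
FINALS = {"q2"}

print(accepts(DELTA, "q0", FINALS, "00101"))
print(accepts(DELTA, "q0", FINALS, "0010"))
```

Testing for a non-empty intersection, rather than membership of the whole set in $F$, is what distinguishes NFA acceptance from DFA acceptance.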




EXAMPLE 4.4.3: (Language accepted by NFA)
For the automaton of figure 4.38 with $w = 00101$, we have

$$\hat{\delta}(q_0, 00101) = \{q_0, q_2\} = P.$$

Since $P$ contains $q_2$, which is the final state, 00101 is accepted.
In other words,

\begin{align*}
\hat{\delta}(q_0, 00101) \cap F &= \{q_0, q_2\} \cap \{q_2\} \\
&= \{q_2\} \\
&\neq \emptyset
\end{align*}

$\Rightarrow$ 00101 is accepted by the NFA.


4.4.8 Important Illustration on NFA 


a. NFA is used to avoid the redundant states and transitions in DFA, which makes the 
modelling easier. 


EXAMPLE 4.4.4: A DFA for the regular expression $(ab + aba)^*$, i.e. $L = (ab \cup aba)^*$, is given in figure 4.42.


Figure 4.42. State Transition Diagram for the DFA.

To avoid redundant states and transitions, and to make the modelling easier, we use an
NFA. Two possible NFAs for the above DFA are:


(a) (b) 
Figure 4.43. Simplified Version of figure 4.42 


b. An NFA does not necessarily go to a unique next state. On reading an input symbol, an NFA
may have no next state at all, or it may select one of several states
nondeterministically (e.g. by throwing a die) as its next state. In other words,
on an input symbol an NFA may go to zero, one, two, or $n$ states.


EXAMPLE 4.4.5: To show that an NFA selects one of several states nondeterministically.


Figure 4.44. State Transition Diagram for the NFA.

i. input = babbbab.

\begin{align*}
(q_0, babbbab) &\vdash (q_0, abbbab) \vdash (q_0, bbbab) \vdash (q_0, bbab) \\
&\vdash (q_0, bab) \vdash (q_0, ab) \vdash (q_0, b) \vdash (q_0, \epsilon)
\end{align*}

i.e., for the input 'babbbab', the above NFA can stay in $q_0$ without moving to another state.

ii. input = babbbab.

\begin{align*}
(q_0, babbbab) &\vdash (q_0, abbbab) \vdash (q_0, bbbab) \vdash (q_1, bbab) \vdash (q_2, bab) \\
&\vdash (q_4, bab) \vdash (q_4, ab) \vdash (q_4, b) \vdash (q_4, \epsilon)
\end{align*}

i.e., for the input 'babbbab', the above NFA has moved to three states from the initial
state $q_0$.

iii. input = babbbab.


Σ (present inputs)
Q        a         b
q0       {q0}      {q0, q1}
q1       {q3}      {q2}
q2       ∅         ∅
q3       ∅         {q4}
q4       {q4}      {q4}
Table 4.20 Transition Table for the NFA in Figure 4.44.

\begin{align*}
\hat{\delta}(q_0, b) &= \{q_0, q_1\} \\
\hat{\delta}(q_0, ba) &= \delta(\hat{\delta}(q_0, b), a) = \delta(q_0, a) \cup \delta(q_1, a) = \{q_0, q_3\} \\
\hat{\delta}(q_0, bab) &= \delta(\hat{\delta}(q_0, ba), b) = \delta(q_0, b) \cup \delta(q_3, b) \\
&= \{q_0, q_1\} \cup \{q_4\} = \{q_0, q_1, q_4\} \\
\hat{\delta}(q_0, babb) &= \delta(\hat{\delta}(q_0, bab), b) = \delta(q_0, b) \cup \delta(q_1, b) \cup \delta(q_4, b) \\
&= \{q_0, q_1\} \cup \{q_2\} \cup \{q_4\} = \{q_0, q_1, q_2, q_4\} \\
\hat{\delta}(q_0, babbb) &= \delta(\hat{\delta}(q_0, babb), b) = \delta(q_0, b) \cup \delta(q_1, b) \cup \delta(q_2, b) \cup \delta(q_4, b) \\
&= \{q_0, q_1, q_2, q_4\} \\
\hat{\delta}(q_0, babbba) &= \delta(\hat{\delta}(q_0, babbb), a) = \delta(q_0, a) \cup \delta(q_1, a) \cup \delta(q_2, a) \cup \delta(q_4, a) \\
&= \{q_0\} \cup \{q_3\} \cup \emptyset \cup \{q_4\} = \{q_0, q_3, q_4\} \\
\hat{\delta}(q_0, babbbab) &= \delta(\hat{\delta}(q_0, babbba), b) = \delta(q_0, b) \cup \delta(q_3, b) \cup \delta(q_4, b) \\
&= \{q_0, q_1, q_4\}
\end{align*}

With babb, the NFA has 4 possible states: $q_0, q_1, q_2, q_4$.
With babbba, the NFA has 3 possible states: $q_0, q_3, q_4$.
With babbbab, the NFA has 3 possible states: $q_0$, $q_1$ and $q_4$. For example, it may stay in $q_0$
for the first four symbols babb, go to $q_1$ on the next b (babbb), and from $q_1$ go
to $q_4$ on ab.


Thus, the above NFA with input babbbab moves to all the states from the initial state. 

c. In an NFA, for each state $q$ of $Q$ and for each symbol $a$ of $\Sigma$, $\delta(q, a)$ must be specified.
However, it can be the empty set, in which case the NFA aborts its operation on that path.

d. As in the case of DFA, the accepting states are used to distinguish the sequences of 
inputs given to the finite automaton. If the finite automaton is in an accepting state 
when the input ends, then the sequence of input symbols given to the finite automaton 
is ‘accepted’. Otherwise, it is not accepted. 

e. Note that any DFA is also an NFA. 

f. A DFA is a special case of an NFA in which the possibility of nondeterminism is not
exploited.

4.5 Design of NFAs


The basic design strategy for an NFA is as follows:


a. Understand the language properties for which the NFA has to be designed.
b. Determine the alphabet and state set required.
c. Identify the initial, accepting and dead states of the NFA.
d. Obtain the transitions to be made for each state on each character of the input string.
e. Draw the transition table and diagram for the NFA.
f. Test the NFA obtained on short strings.


EXAMPLE 4.5.1: Design an NFA that accepts set of all strings over {0, 1} that have at least 
two consecutive 0’s or 1’s. 


Solution: We are required to design an NFA for the regular expression $r = (0 + 1)^*(00 +
11)(0 + 1)^*$. In other words, the NFA $M$ should be designed to accept the language of $r$,

i.e., $L(M) = \{00, 11, 0110, 1001, 10110101, \ldots\}$.

Consider
$\Sigma = \{0, 1\}$, $Q = \{q_0, q_1, q_2, q_3, q_4\}$,
$q_0$ = initial state, $\{q_2, q_4\}$ = final states, and $\delta$ is given by:

Transition diagram:

Figure 4.45. State Diagram to represent
$L(M) = \{w \in \Sigma^* \mid w \text{ contains at least two consecutive 0's or 1's}\}$.

Transition table:

Σ (present inputs)
Q        0           1
→q0      {q0, q3}    {q0, q1}
q1       ∅           {q2}
*q2      {q2}        {q2}
q3       {q4}        ∅
*q4      {q4}        {q4}
Table 4.21 State Transition Table


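NFA action for the input string
a. To show that the string 0100011 is accepted by the NFA.

i. Using extended transition function:

$\hat{\delta}(q_0, \epsilon) = \{q_0\}$, $\hat{\delta}(q_0, 0) = \{q_0, q_3\}$ and $\hat{\delta}(q_0, 1) = \{q_0, q_1\}$.

\begin{align*}
\hat{\delta}(q_0, 01) &= \delta(\hat{\delta}(q_0, 0), 1) = \delta(q_0, 1) \cup \delta(q_3, 1) = \{q_0, q_1\} \cup \emptyset = \{q_0, q_1\} \\
\hat{\delta}(q_0, 010) &= \delta(\hat{\delta}(q_0, 01), 0) = \delta(q_0, 0) \cup \delta(q_1, 0) = \{q_0, q_3\} \cup \emptyset = \{q_0, q_3\} \\
\hat{\delta}(q_0, 0100) &= \delta(\hat{\delta}(q_0, 010), 0) = \delta(q_0, 0) \cup \delta(q_3, 0) = \{q_0, q_3\} \cup \{q_4\} = \{q_0, q_3, q_4\} \\
\hat{\delta}(q_0, 01000) &= \delta(\hat{\delta}(q_0, 0100), 0) = \delta(q_0, 0) \cup \delta(q_3, 0) \cup \delta(q_4, 0) \\
&= \{q_0, q_3\} \cup \{q_4\} \cup \{q_4\} = \{q_0, q_3, q_4\} \\
\hat{\delta}(q_0, 010001) &= \delta(\hat{\delta}(q_0, 01000), 1) = \delta(q_0, 1) \cup \delta(q_3, 1) \cup \delta(q_4, 1) \\
&= \{q_0, q_1\} \cup \emptyset \cup \{q_4\} = \{q_0, q_1, q_4\} \\
\hat{\delta}(q_0, 0100011) &= \delta(\hat{\delta}(q_0, 010001), 1) = \delta(q_0, 1) \cup \delta(q_1, 1) \cup \delta(q_4, 1) \\
&= \{q_0, q_1\} \cup \{q_2\} \cup \{q_4\} = \{q_0, q_1, q_2, q_4\}
\end{align*}

Now, let $P = \{q_0, q_1, q_2, q_4\}$.
Since $P$ contains $q_2$ and $q_4$ (which are final states), 0100011 is accepted
by $M$, i.e.

\begin{align*}
\hat{\delta}(q_0, 0100011) \cap F &= \{q_0, q_1, q_2, q_4\} \cap \{q_2, q_4\} \\
&= \{q_2, q_4\} \\
&\neq \emptyset \Rightarrow 0100011 \text{ is accepted.}
\end{align*}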


ii. Using tree state diagram for input 0100011:


Figure 4.46. Tree State Diagram for the String 0100011


(*) indicates accepting paths in the NFA tree.
Since there exists an accepting path, 0100011 is accepted.

b. To show that the string 0101 is rejected by the NFA:
i. Using extended transition function:

$\hat{\delta}(q_0, \epsilon) = \{q_0\}$, $\hat{\delta}(q_0, 0) = \{q_0, q_3\}$, $\hat{\delta}(q_0, 1) = \{q_0, q_1\}$.

\begin{align*}
\hat{\delta}(q_0, 01) &= \delta(\hat{\delta}(q_0, 0), 1) = \delta(q_0, 1) \cup \delta(q_3, 1) = \{q_0, q_1\} \cup \emptyset = \{q_0, q_1\} \\
\hat{\delta}(q_0, 010) &= \delta(\hat{\delta}(q_0, 01), 0) = \delta(q_0, 0) \cup \delta(q_1, 0) = \{q_0, q_3\} \cup \emptyset = \{q_0, q_3\} \\
\hat{\delta}(q_0, 0101) &= \delta(\hat{\delta}(q_0, 010), 1) = \delta(q_0, 1) \cup \delta(q_3, 1) = \{q_0, q_1\} \cup \emptyset = \{q_0, q_1\}
\end{align*}

Let $P = \{q_0, q_1\}$. Since $P$ contains no final states, 0101 is rejected by the
NFA, i.e.

\begin{align*}
\hat{\delta}(q_0, 0101) \cap F &= \{q_0, q_1\} \cap \{q_2, q_4\} \\
&= \emptyset \Rightarrow 0101 \text{ is rejected by the NFA.}
\end{align*}

ii. Using tree state diagram for the input 0101:


Figure 4.47. Tree State Diagram for the string 0101
Since no accepting path exists, 0101 is rejected by the NFA.
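
The two hand computations of Example 4.5.1 can be reproduced with a short Python sketch. The nested-dictionary encoding of Table 4.21 and the helper name `trace` are assumptions made for illustration; only the table entries themselves come from the example.

```python
# Table 4.21 encoded as state -> symbol -> set of next states.
TABLE_4_21 = {
    "q0": {"0": {"q0", "q3"}, "1": {"q0", "q1"}},
    "q1": {"0": set(),        "1": {"q2"}},
    "q2": {"0": {"q2"},       "1": {"q2"}},
    "q3": {"0": {"q4"},       "1": set()},
    "q4": {"0": {"q4"},       "1": {"q4"}},
}
FINALS = {"q2", "q4"}


def trace(table, start, w):
    """Return the reachable state set after each prefix of w (prefix '' first)."""
    sets = [{start}]
    for a in w:
        nxt = set()
        for p in sets[-1]:
            nxt |= table[p][a]
        sets.append(nxt)
    return sets


for w in ("0100011", "0101"):
    final_set = trace(TABLE_4_21, "q0", w)[-1]
    verdict = "accepted" if final_set & FINALS else "rejected"
    print(w, sorted(final_set), verdict)
```

Each entry of the list returned by `trace` corresponds to one line of the hand computation, so intermediate sets can be compared step by step.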
EXAMPLE 4.5.2: Design an NFA to accept the set of all strings starting with 'a', followed by
'a' or 'b', and ending with a or any number of b's.


Solution: Here it is required to design an NFA for the regular expression $r = a^*(ab + a +
ba)(bb)^*$.
In other words, the NFA $M$ should be designed to accept the language of $r$,
i.e., $L(M) = \{a, ab, aaa, abbbb, \ldots\}$.


Consider
$\Sigma = \{a, b\}$, $Q = \{q_0, q_1, q_2, q_3\}$,
$q_0$ = initial state, $q_3$ = final state, and $\delta$ is given by:


Transition diagram:

Figure 4.48. State Transition Diagram to represent
$L(M) = \{w \in \Sigma^* \mid w \text{ starts with } a \text{, is followed by } a \text{ or } b \text{, and ends with } a \text{ or } b\}$


Transition table:

Σ (present inputs)
Q        a               b
q0       {q0, q1, q3}    {q2}
q1       ∅               {q3}
q2       {q3}            ∅
q3       ∅               {q1}
Table 4.22 State Transition Table

NFA action for the input string
To show that the string abbbb is accepted by the NFA.
i. Using extended transition function:

$\hat{\delta}(q_0, \epsilon) = \{q_0\}$, $\hat{\delta}(q_0, a) = \{q_0, q_1, q_3\}$, $\hat{\delta}(q_0, b) = \{q_2\}$.

\begin{align*}
\hat{\delta}(q_0, ab) &= \delta(\hat{\delta}(q_0, a), b) = \delta(q_0, b) \cup \delta(q_1, b) \cup \delta(q_3, b) \\
&= \{q_2\} \cup \{q_3\} \cup \{q_1\} = \{q_1, q_2, q_3\} \\
\hat{\delta}(q_0, abb) &= \delta(\hat{\delta}(q_0, ab), b) = \delta(q_1, b) \cup \delta(q_2, b) \cup \delta(q_3, b) \\
&= \{q_3\} \cup \emptyset \cup \{q_1\} = \{q_1, q_3\} \\
\hat{\delta}(q_0, abbb) &= \delta(\hat{\delta}(q_0, abb), b) = \delta(q_1, b) \cup \delta(q_3, b) = \{q_1, q_3\} \\
\hat{\delta}(q_0, abbbb) &= \delta(\hat{\delta}(q_0, abbb), b) = \delta(q_1, b) \cup \delta(q_3, b) = \{q_1, q_3\}
\end{align*}

Let $P = \{q_1, q_3\}$. Since $P$ contains the final state $q_3$, abbbb is accepted by
the NFA, i.e.

$$\hat{\delta}(q_0, abbbb) \cap F = \{q_1, q_3\} \cap \{q_3\} = \{q_3\} \neq \emptyset \Rightarrow \text{accepted.}$$
ii. Using tree state diagram for input abbbb:


Figure 4.49. Tree State Diagram for the string abbbb
Since there exists an accepting path in the tree, abbbb is accepted.

EXAMPLE 4.5.3: Design an NFA that accepts the set of all strings ending in 00.


Solution: Here it is required to design an NFA for the regular expression $r = (1 + 0)^*(00)$.
In other words, the NFA $M$ should be designed to accept the language of $r$,

i.e., $L(M) = \{00, 1100, 100, 000, 111000, \ldots\}$.

Consider
$\Sigma = \{0, 1\}$, $Q = \{q_0, q_1, q_2\}$,
$q_0$ = initial state, $F = \{q_2\}$ is the set of final states, and $\delta$ is given by:


Transition diagram:


Figure 4.50. State Transition Diagram to represent
$L(M) = \{w \in \Sigma^* \mid w \text{ ends with } 00\}$.


Transition table:


Σ (present inputs)
Q        0        1
q0       {q1}     {q0}
q1       {q2}     {q0}
q2       {q2}     {?}
Table 4.23 Transition Table


NFA action for the input string


To show that the string 10100 is accepted by the NFA (using the extended transition
function):

$\hat{\delta}(q_0, \epsilon) = \{q_0\}$, $\hat{\delta}(q_0, 0) = \{q_1\}$ and $\hat{\delta}(q_0, 1) = \{q_0\}$.

\begin{align*}
\hat{\delta}(q_0, 10) &= \delta(\hat{\delta}(q_0, 1), 0) = \delta(q_0, 0) = \{q_1\} \\
\hat{\delta}(q_0, 101) &= \delta(\hat{\delta}(q_0, 10), 1) = \delta(q_1, 1) = \{q_0\} \\
\hat{\delta}(q_0, 1010) &= \delta(\hat{\delta}(q_0, 101), 0) = \delta(q_0, 0) = \{q_1\} \\
\hat{\delta}(q_0, 10100) &= \delta(\hat{\delta}(q_0, 1010), 0) = \delta(q_1, 0) = \{q_2\}
\end{align*}

Since $q_2$ is the final state, 10100 is accepted by the NFA, i.e.

$$\hat{\delta}(q_0, 10100) \cap F = \{q_2\} \cap \{q_2\} = \{q_2\} \neq \emptyset, \text{ hence accepted.}$$


EXAMPLE 4.5.4: Design an NFA to accept all the strings over the alphabet {a, b} ending in 
aba. 


Solution: We are required to design an NFA for the regular expression, r = (a + b)*aba. 
In other words, NFA, M should be designed to accept the language of r 


i.e., L(M) = {aba, aaaba, ababa, abababa, ...}. 


Consider $\Sigma = \{a, b\}$, $Q = \{q_0, q_1, q_2, q_3\}$, $q_0$ = initial state, $q_3$ = final state, and $\delta$ is given
by:


Transition diagram:


Figure 4.51. State Transition Diagram to represent
$L(M) = \{w \in \Sigma^* \mid w \text{ ends with } aba\}$

Transition table:


Σ (present inputs)
Q        a           b
q0       {q0, q1}    {q0}
q1       ∅           {q2}
q2       {q3}        ∅
q3       ∅           ∅


Table 4.24 State Transition Table


NFA action for the input string
To show that ababa is accepted by the NFA (using tree state diagram):


Figure 4.52. Tree state diagram for the string ababa.

(*) indicates a point at which the final state $q_3$ is reached.
Since there exists an accepting path in the tree state diagram, i.e. a path that consumes
the whole input and ends in $q_3$, ababa is accepted.
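
The tree state diagram can be explored programmatically by enumerating every path that consumes the whole input; branches on which the NFA has no move simply die. The following Python sketch is an illustration under assumptions: the dictionary encoding of Table 4.24 and the function name `paths` are not from the text.

```python
# Table 4.24: missing entries stand for the empty set (the branch dies).
TABLE_4_24 = {
    "q0": {"a": {"q0", "q1"}, "b": {"q0"}},
    "q1": {"b": {"q2"}},
    "q2": {"a": {"q3"}},
    "q3": {},
}
FINALS = {"q3"}


def paths(table, state, w):
    """Yield every complete path (list of states) that consumes all of w."""
    if not w:
        yield [state]
        return
    # sorted() only makes the enumeration order deterministic
    for nxt in sorted(table[state].get(w[0], set())):
        for rest in paths(table, nxt, w[1:]):
            yield [state] + rest


all_paths = list(paths(TABLE_4_24, "q0", "ababa"))
accepting = [p for p in all_paths if p[-1] in FINALS]
for p in all_paths:
    print(" -> ".join(p), "(*)" if p[-1] in FINALS else "")
```

A string is accepted as soon as `accepting` is non-empty. Note that the number of paths can grow exponentially with the input length, which is why the set-based extended transition function is usually preferred for actual checking.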


EXAMPLE 4.5.5: Construct an NFA that accepts the set of all strings over {0, 1} that start 
with 0 or 1 and end with 01 or 10. 


Solution: Here we are required to design an NFA for the regular expression $r =
(0+1)^*(01+10)$. In other words, the NFA $M$ should be designed to accept the language of $r$,


i.e., $L(M) = \{01, 10, 0010, 1110, 011001, \ldots\}$.

Consider
$\Sigma = \{0, 1\}$, $Q = \{q_0, q_1, q_2, q_3\}$,
$q_0$ = initial state, $q_3$ = final state, and $\delta$ is given by:


Transition diagram:


Figure 4.53. State Transition Diagram to represent
$L(M) = \{w \in \Sigma^* \mid w \text{ is a string of 0's and 1's ending in } 01 \text{ or } 10\}$


Transition table:


Σ (present inputs)
Q        0           1
→q0      {q0, q1}    {q0, q2}
q1       ∅           {q3}
q2       {q3}        ∅
*q3      ∅           ∅


Table 4.25 Transition Table


NFA action for the input string
To show that 0110 is accepted by the NFA (using the extended transition function):

$\hat{\delta}(q_0, \epsilon) = \{q_0\}$, $\hat{\delta}(q_0, 0) = \{q_0, q_1\}$, $\hat{\delta}(q_0, 1) = \{q_0, q_2\}$.

\begin{align*}
\hat{\delta}(q_0, 01) &= \delta(\hat{\delta}(q_0, 0), 1) = \delta(q_0, 1) \cup \delta(q_1, 1) = \{q_0, q_2\} \cup \{q_3\} = \{q_0, q_2, q_3\} \\
\hat{\delta}(q_0, 011) &= \delta(\hat{\delta}(q_0, 01), 1) = \delta(q_0, 1) \cup \delta(q_2, 1) \cup \delta(q_3, 1) \\
&= \{q_0, q_2\} \cup \emptyset \cup \emptyset = \{q_0, q_2\} \\
\hat{\delta}(q_0, 0110) &= \delta(\hat{\delta}(q_0, 011), 0) = \delta(q_0, 0) \cup \delta(q_2, 0) = \{q_0, q_1\} \cup \{q_3\} = \{q_0, q_1, q_3\}
\end{align*}

Since $q_3$ is a final state, 0110 is accepted by the NFA.
EXAMPLE 4.5.6: Give an NFA which accepts all strings with ab over {a, b}. 


Solution: Let $M = (Q, \Sigma, q_0, \delta, F)$ be the NFA such that $Q = \{q_0, q_2, q_3\}$, $\Sigma = \{a, b\}$, $q_0$ is the
initial state and $F = \{q_3\}$. The transition function $\delta$ of $M$, which accepts all strings of the form
$r = ab(ab)^*$, is shown in figure 4.54.


Figure 4.54. Transition Diagram to represent
$L(M) = \{w \in \Sigma^* \mid w \text{ starts with } ab \text{ and is followed by any number of repetitions of } ab\}$, $r = ab(ab)^*$


EXAMPLE 4.5.7: Construct a transition system, which can accept strings over the alphabets 
a,b,...... containing either ‘cat’ or ‘rat’. 


Solution: Let $M = (Q, \Sigma, q_0, \delta, F)$ be the NFA such that $Q = \{q_0, q_1, q_2, q_3\}$,
$\Sigma = \{a, b, \ldots\}$, $q_0$ is the initial state and $F = \{q_3\}$. The transition function $\delta$ is shown in figure 4.55.


Figure 4.55. Transition Diagram of NFA.

EXAMPLE 4.5.8: Construct an NFA whose language consists of all the strings over
{0,...,9}, ending in either 1102, 0102, or 1110.


Solution: The NFA to accept all the strings ending in either 1102, 0102 or 1110 is given 
below: 


Figure 4.56. Transition Diagram of NFA for L(M) = (0 — 9)*(1102 + 0102 + 1110) 


EXAMPLE 4.5.9: Construct an NFA to accept the string a or ab. 


Solution: Let $M = (Q, \Sigma, q_0, \delta, F)$ be the NFA such that $Q = \{0, 1, 2\}$, $\Sigma = \{a, b\}$,
$q_0 = 0$ and $F = \{2\}$. The transition function $\delta$ of $M$, which accepts the string a or ab, is shown in figure 4.57.


Figure 4.57. Transition Diagram, NFA for $r = a + ab$

EXAMPLE 4.5.10: Construct an NFA that will accept those strings of decimal digits that
are divisible by three.


Solution: Let $M = (Q, \Sigma, q_0, \delta, F)$ be the NFA such that $Q = \{A, B, C\}$, $\Sigma = \{0, \ldots, 9\}$, $q_0 = A$
and $F = \{A\}$. The transition function $\delta$ of $M$, which accepts the strings of decimal digits that are divisible
by three, is shown in figure 4.58.

[Figure: three states A, B and C, with arcs labelled 0/3/6/9, 1/4/7 and 2/5/8.]


Figure 4.58. Transition Diagram that accepts string of decimals that are divisible by three. 
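
The idea behind this machine is that a decimal number is divisible by three exactly when its digit sum is, so the state only needs to remember the running remainder. The Python sketch below illustrates this. It assumes, since the figure is not recoverable here, that state A stands for remainder 0, B for remainder 1 and C for remainder 2; the names `STATES`, `step` and `divisible_by_three` are likewise hypothetical.

```python
STATES = "ABC"          # assumed meaning: A = remainder 0, B = 1, C = 2


def step(state: str, digit: str) -> str:
    """One transition: digits 0/3/6/9 keep the state, 1/4/7 advance it by
    one remainder class, and 2/5/8 advance it by two."""
    return STATES[(STATES.index(state) + int(digit)) % 3]


def divisible_by_three(digits: str) -> bool:
    state = "A"                         # initial state
    for d in digits:
        state = step(state, d)
    return state == "A"                 # A is the only final state


for s in ("123", "124", "9", "1000002"):
    print(s, divisible_by_three(s))
```

Every state has exactly one successor per digit, so this particular machine is in fact deterministic, which is consistent with the remark that any DFA is also an NFA. Because the initial state is final, the empty string is accepted as well.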


EXAMPLE 4.5.11: Construct an NFA that accepts the language L = {w: |w| mod 3 = 0} 
on & = {a,b}. 


Solution: Let $M = (Q, \Sigma, q_0, \delta, F)$ be the NFA such that $Q = \{q_0, q_1, q_2\}$, $\Sigma = \{a, b\}$, $q_0$ is the
initial state and $F = \{q_0\}$. The transition function $\delta$ of the NFA, which accepts all strings whose length
is a multiple of 3, is shown in figure 4.59.


Figure 4.59. NFA to accept L = {aaa, aaaaaa, aaaaaaaaa, . . .}

EXAMPLE 4.5.12: Construct an NFA that accepts the language $L = \{w : |w| \bmod 5 \neq 0\}$
on $\Sigma = \{a, b\}$.


Solution: Let $M = (Q, \Sigma, q_0, \delta, F)$ be the NFA such that $Q = \{q_0, q_1, q_2, q_3, q_4\}$, $\Sigma =
\{a, b\}$, $q_0$ is the initial state and $F = \{q_1, q_2, q_3, q_4\}$. The transition function $\delta$ of the NFA, which accepts all strings of a's
and b's whose lengths are not multiples of 5, is shown in figure 4.60.


Figure 4.60. NFA to accept words with $|w| \bmod 5 \neq 0$


EXAMPLE 4.5.13: Consider an NFA given below. Check whether the strings 001, 011101, 
01110, 010 are accepted by the machine or not. 




Figure 4.61. Transition diagram of NFA 


Solution: The transition function $\delta$ is defined by


Σ (present inputs)
Q        0           1
q0       {q0, q1}    {q0}
q1       ∅           {q2}
*q2      ∅           ∅
Table 4.26 Transition Table


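Given $q_0$ = initial state and $F = \{q_2\}$.

a. Input = 001

\begin{align*}
\hat{\delta}(q_0, 00) &= \delta(\hat{\delta}(q_0, 0), 0) = \delta(q_0, 0) \cup \delta(q_1, 0) = \{q_0, q_1\} \cup \emptyset = \{q_0, q_1\} \\
\hat{\delta}(q_0, 001) &= \delta(\hat{\delta}(q_0, 00), 1) = \delta(q_0, 1) \cup \delta(q_1, 1) = \{q_0\} \cup \{q_2\} = \{q_0, q_2\}
\end{align*}

$\{q_0, q_2\} \cap F \neq \emptyset \Rightarrow$ accepted.


b. Input = 011101

\begin{align*}
\hat{\delta}(q_0, 01) &= \delta(\hat{\delta}(q_0, 0), 1) = \delta(q_0, 1) \cup \delta(q_1, 1) = \{q_0\} \cup \{q_2\} = \{q_0, q_2\} \\
\hat{\delta}(q_0, 011) &= \delta(\hat{\delta}(q_0, 01), 1) = \delta(q_0, 1) \cup \delta(q_2, 1) = \{q_0\} \cup \emptyset = \{q_0\} \\
\hat{\delta}(q_0, 0111) &= \delta(\hat{\delta}(q_0, 011), 1) = \delta(q_0, 1) = \{q_0\} \\
\hat{\delta}(q_0, 01110) &= \delta(\hat{\delta}(q_0, 0111), 0) = \delta(q_0, 0) = \{q_0, q_1\} \\
\hat{\delta}(q_0, 011101) &= \delta(\hat{\delta}(q_0, 01110), 1) = \delta(q_0, 1) \cup \delta(q_1, 1) = \{q_0\} \cup \{q_2\} = \{q_0, q_2\}
\end{align*}

$\{q_0, q_2\} \cap F \neq \emptyset \Rightarrow$ accepted.


c. Input = 01110

\begin{align*}
\hat{\delta}(q_0, 01) &= \delta(\hat{\delta}(q_0, 0), 1) = \delta(q_0, 1) \cup \delta(q_1, 1) = \{q_0\} \cup \{q_2\} = \{q_0, q_2\} \\
\hat{\delta}(q_0, 011) &= \delta(\hat{\delta}(q_0, 01), 1) = \delta(q_0, 1) \cup \delta(q_2, 1) = \{q_0\} \cup \emptyset = \{q_0\} \\
\hat{\delta}(q_0, 0111) &= \delta(\hat{\delta}(q_0, 011), 1) = \delta(q_0, 1) = \{q_0\} \\
\hat{\delta}(q_0, 01110) &= \delta(\hat{\delta}(q_0, 0111), 0) = \delta(q_0, 0) = \{q_0, q_1\}
\end{align*}

$\{q_0, q_1\} \cap F = \emptyset \Rightarrow$ not accepted.


d. Input = 010

\begin{align*}
\hat{\delta}(q_0, 01) &= \delta(\hat{\delta}(q_0, 0), 1) = \delta(q_0, 1) \cup \delta(q_1, 1) = \{q_0\} \cup \{q_2\} = \{q_0, q_2\} \\
\hat{\delta}(q_0, 010) &= \delta(\hat{\delta}(q_0, 01), 0) = \delta(q_0, 0) \cup \delta(q_2, 0) = \{q_0, q_1\} \cup \emptyset = \{q_0, q_1\}
\end{align*}

$\{q_0, q_1\} \cap F = \emptyset \Rightarrow$ not accepted.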

EXAMPLE 4.5.14: Which of the following strings are accepted by NFA: 010, 110? 




Figure 4.62. Transition Diagram of NFA. 


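Solution:


Σ (present inputs)
Q        0           1
q0       {q0, q1}    {q1}
q1       ∅           {q0, q1}


Table 4.27 Transition Table


Given $F = \{q_1\}$ and $q_0$ = initial state.
a. Input = 010

\begin{align*}
\hat{\delta}(q_0, 01) &= \delta(\hat{\delta}(q_0, 0), 1) = \delta(q_0, 1) \cup \delta(q_1, 1) = \{q_1\} \cup \{q_0, q_1\} = \{q_0, q_1\} \\
\hat{\delta}(q_0, 010) &= \delta(\hat{\delta}(q_0, 01), 0) = \delta(q_0, 0) \cup \delta(q_1, 0) = \{q_0, q_1\} \cup \emptyset = \{q_0, q_1\}
\end{align*}

$\{q_0, q_1\} \cap F \neq \emptyset \Rightarrow$ accepted.

b. Input = 110

\begin{align*}
\hat{\delta}(q_0, 11) &= \delta(\hat{\delta}(q_0, 1), 1) = \delta(q_1, 1) = \{q_0, q_1\} \\
\hat{\delta}(q_0, 110) &= \delta(\hat{\delta}(q_0, 11), 0) = \delta(q_0, 0) \cup \delta(q_1, 0) = \{q_0, q_1\} \cup \emptyset = \{q_0, q_1\}
\end{align*}

$\{q_0, q_1\} \cap F \neq \emptyset \Rightarrow$ accepted.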


4.6 Non-Deterministic Automata with $\epsilon$-Moves


4.6.1 Introduction


In some situations, it is useful to enhance NFAs by allowing transitions that are
not triggered by any input symbol; such transitions are called $\epsilon$-transitions. If in a state
$q$ an $\epsilon$-transition to a state $q'$ is possible, then whenever the automaton reaches state $q$ in
a computation, it can immediately proceed to state $q'$ without taking any further input.
This capability does not expand the class of languages that can be accepted by
finite automata, but it gives us some programming convenience.


EXAMPLE 4.6.1: NFA with $\epsilon$-transitions from $q_0$ to $q_1$ and from $q_1$ to $q_2$.


Figure 4.63. NFA with $\epsilon$-Transitions showing that, without any Input Symbol, the NFA
can be in the final State

EXAMPLE 4.6.2: Consider an NFA with an $\epsilon$-transition from $q_1$ to $q_2$, as shown in figure 4.64.


Figure 4.64. NFA with $\epsilon$-Transition showing that, without any Input Symbol, it can change
from $q_1$ to $q_2$


For the input string 010110, the processing of the above NFA is as follows:


Figure 4.65. Computation of the NFA with $\epsilon$-Transitions of fig. 4.64 on Input String 010110

Definition


An NFA that can move to a next state on the empty string $\epsilon$, i.e. without reading an input
symbol, is called an NFA with $\epsilon$-moves.


Alternatively, a finite automaton that is modified to permit transitions without input
symbols, in addition to zero, one, or more transitions on input symbols, is an NFA with $\epsilon$-moves
(NFA-$\epsilon$).


4.6.2 Elements of NFA-$\epsilon$


An NFA with $\epsilon$-moves has the following five components:


a. A finite set of states ($Q$).
b. An alphabet $\Sigma$ of possible input symbols.
c. The transition function ($\delta$), which specifies the transitions at a given state due
to the input symbols, including $\epsilon$.
d. The initial state $q_0$.
e. The set of final states ($F$).


Note-4:
The term '$\epsilon$-move' or '$\epsilon$-transition' refers to the fact that a transition
takes place without reading any symbol of the input.


4.6.3 How an NFA-$\epsilon$ Operates


The operation of an NFA-$\epsilon$ is demonstrated in figure 4.66.


4.6.4 Ordered Quintuple Specification of NFA-$\epsilon$


Formally, an NFA with $\epsilon$-moves is a five-tuple:

$$\text{NFA-}\epsilon = (Q, \Sigma, \delta, q_0, F)$$

where

a. $Q$ is a finite set of states
b. $\Sigma$ is a set of input symbols
c. $\delta$ is a transition function such that $\delta : Q \times (\Sigma \cup \{\epsilon\}) \to P(Q)$
(which takes care of $\epsilon$ and non-$\epsilon$ transitions)
d. $q_0 \in Q$ is the initial state
e. $F \subseteq Q$ is a set of final states.


Figure 4.66. P-U: Demonstration of Non-deterministic Finite Automata with $\epsilon$-moves
(alphabet = {a}; on an $\epsilon$-move the read head does not move; the string aa is accepted)

4.6.5 Transitions of NFA-$\epsilon$


With respect to an NFA-$\epsilon$, there exist two kinds of transitions:


■ $\epsilon$-transitions: Transitions that take place without reading any input symbol are
called $\epsilon$-transitions.
If in a state $q$ an $\epsilon$-transition to a state $q'$ is possible, then whenever the automaton reaches
state $q$ in a computation, it can immediately proceed to state $q'$ without reading any
further input.

■ non-$\epsilon$-transitions: Transitions that take place with the reading of input symbols
are called non-$\epsilon$-transitions.

4.6.6 Description of an NFA-€ 


The transitions of an NFA-$\epsilon$ can be represented by using a transition table and a transition diagram.


Transition table: The table has rows corresponding to states and columns corresponding to
inputs; each entry is a set of states, written within braces {}. There is also an additional column
which, for each state $q$, gives the set of all states reachable from $q$ by an $\epsilon$-transition.


Transition diagram: It has two types of labels:


■ non-$\epsilon$-transition labels
If $\delta(q, a) = \{q_1, \ldots, q_k\}$, then there will be an arc labelled '$a$' from $q$ to each of $q_1, \ldots, q_k$.

■ $\epsilon$-transition labels
If $\delta(q, \epsilon) = \{q_1, \ldots, q_k\}$, then there will be an arc labelled '$\epsilon$' from $q$ to each of
$q_1, \ldots, q_k$.


EXAMPLE 4.6.3: (Describing NFA-$\epsilon$)
Consider
$Q = \{q_0, q_1, q_2\}$, $\Sigma = \{a, b\}$,
$q_0$ = initial state, $F = \{q_2\}$,
and the transition function $\delta$ of the NFA-$\epsilon$ as shown in figure 4.67.


Figure 4.67. Transition Diagram of NFA-$\epsilon$

Present inputs
Q        a        b        ε
q0       {q1}     ∅        {q1, q2}
q1       ∅        {q2}     {q2}
q2       ∅        ∅        ∅


Table 4.28 Transition Table NFA-$\epsilon$


4.6.7 Acceptance of Strings by NFA-$\epsilon$


A string is said to be accepted by the NFA-$\epsilon$ if at least one path exists that starts at the initial
state and ends in one of the final states. The path may be formed from both
$\epsilon$-transitions and non-$\epsilon$-transitions.


In order to identify all the states on such a path that are reached by $\epsilon$-transitions, the
concept of '$\epsilon$-closure' is used.


$\epsilon$-Closure: The $\epsilon$-closure of a state is the set of states reachable from that state without
reading any symbol.

The $\epsilon$-closure of a given state $q$ is defined as the set of all states of the automaton that can be
reached from $q$ on a path labelled by $\epsilon$ only; it always contains $q$ itself.


In general, if $P$ is a subset of $Q$, then $\epsilon$-closure($P$) is the smallest set such that

a. $P \subseteq \epsilon\text{-closure}(P)$.

b. For $q$ of $Q$, if $q \in \epsilon\text{-closure}(P)$ then $\delta(q, \epsilon) \subseteq \epsilon\text{-closure}(P)$.


EXAMPLE 4.6.4: ($\epsilon$-closure for NFA-$\epsilon$)


Figure 4.68. Transition Diagram of NFA-$\epsilon$

a. To compute $\epsilon$-closure({2}):
$\{2\} \subseteq \epsilon\text{-closure}(\{2\})$; [every state belongs to its own $\epsilon$-closure]
$\{2, 3, 4\} \subseteq \epsilon\text{-closure}(\{2\})$; [$\because \delta(2, \epsilon) = \{3, 4\}$]
$\{2, 3, 4, 5\} \subseteq \epsilon\text{-closure}(\{2\})$; [$\because \delta(3, \epsilon) = \{5\}$ and $\delta(4, \epsilon) = \emptyset$;
further, $\delta(5, \epsilon) = \emptyset$]
$\Rightarrow$ no new states can now be added.
Hence the process of generating the $\epsilon$-closure terminates and
$\epsilon\text{-closure}(\{2\}) = \{2, 3, 4, 5\}$.
This means that from the state 2, the set of states that can be reached without any
input symbol is {2, 3, 4, 5}.
b. To compute $\epsilon$-closure({1}):
$\{1\} \subseteq \epsilon\text{-closure}(\{1\})$; [every state belongs to its own $\epsilon$-closure]
$\{1, 2\} \subseteq \epsilon\text{-closure}(\{1\})$; [$\because \delta(1, \epsilon) = \{2\}$]
$\{1, 2, 3, 4\} \subseteq \epsilon\text{-closure}(\{1\})$; [$\because \delta(2, \epsilon) = \{3, 4\}$]
$\{1, 2, 3, 4, 5\} \subseteq \epsilon\text{-closure}(\{1\})$; [$\because \delta(3, \epsilon) = \{5\}$ and $\delta(4, \epsilon) = \emptyset$]
$\{1, 2, 3, 4, 5\} \subseteq \epsilon\text{-closure}(\{1\})$; [$\because \delta(5, \epsilon) = \emptyset$]


Hence, the process of generating the $\epsilon$-closure terminates and $\epsilon\text{-closure}(\{1\}) = \{1, 2, 3,
4, 5\}$.
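
The closure computation is a plain reachability search that follows only $\epsilon$-arcs until no new state can be added. A minimal Python sketch is given below; the function name `epsilon_closure` and the dictionary `EPS` are assumptions for illustration, with `EPS` holding the $\epsilon$-transitions used in the computations of Example 4.6.4.

```python
def epsilon_closure(eps, states):
    """Return all states reachable from `states` using epsilon-moves only."""
    closure = set(states)               # every state is in its own closure
    stack = list(states)
    while stack:                        # stop when no new state can be added
        q = stack.pop()
        for r in eps.get(q, set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


# Epsilon-moves read off Example 4.6.4: 1 -> 2, 2 -> {3, 4}, 3 -> 5.
EPS = {1: {2}, 2: {3, 4}, 3: {5}}

print(sorted(epsilon_closure(EPS, {2})))
print(sorted(epsilon_closure(EPS, {1})))
```

The `closure` set doubles as the "visited" set, so the search terminates even if the $\epsilon$-arcs form a cycle.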


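EXAMPLE 4.6.5: ($\epsilon$-closure for NFA-$\epsilon$)


Figure 4.69. Transition Diagram of NFA-$\epsilon$


a. To compute $\epsilon$-closure($\{q_0\}$):

$\{q_0\} \subseteq \epsilon\text{-closure}(\{q_0\})$, [every state belongs to its own $\epsilon$-closure]
$\{q_0, q_1\} \subseteq \epsilon\text{-closure}(\{q_0\})$, [$\because \delta(q_0, \epsilon) = \{q_1\}$]
$\{q_0, q_1, q_2\} \subseteq \epsilon\text{-closure}(\{q_0\})$, [$\because \delta(q_1, \epsilon) = \{q_2\}$]
$\{q_0, q_1, q_2\} \subseteq \epsilon\text{-closure}(\{q_0\})$, [$\because \delta(q_2, \epsilon) = \emptyset$]
Hence, the process of generating the $\epsilon$-closure terminates and $\epsilon\text{-closure}(\{q_0\}) =
\{q_0, q_1, q_2\}$.

b. To compute $\epsilon$-closure($\{q_1\}$):

$\{q_1\} \subseteq \epsilon\text{-closure}(\{q_1\})$, [every state belongs to its own $\epsilon$-closure]
$\{q_1, q_2\} \subseteq \epsilon\text{-closure}(\{q_1\})$, [$\because \delta(q_1, \epsilon) = \{q_2\}$]
$\{q_1, q_2\} \subseteq \epsilon\text{-closure}(\{q_1\})$, [$\because \delta(q_2, \epsilon) = \emptyset$]


Hence, the process of generating the $\epsilon$-closure terminates and $\epsilon\text{-closure}(\{q_1\}) = \{q_1, q_2\}$.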


4.6.8 Extended Transition function for NFA-€ 


Definition of $\hat{\delta}$:


For a state $q$ and a string $x$, $\hat{\delta}(q, x)$ is the set of states that the NFA-$\epsilon$ can reach when it
reads the string $x$, starting at state $q$ (with or without $\epsilon$-moves). In other words, it nondeterministically
goes through a number of states from the state $q$ as it reads the symbols in the string $x$.
Thus, for an NFA-$\epsilon$ $M = (Q, \Sigma, q_0, \delta, F)$, the function $\delta$ is extended to

$$\hat{\delta} : Q \times \Sigma^* \to P(Q)$$

and is recursively defined as follows:


a. For any state $q$ of $Q$,

$$\hat{\delta}(q, \epsilon) = \epsilon\text{-closure}(\{q\}).$$

This means that, on the empty string, the NFA-$\epsilon$ can be in any state that is reachable
from $q$ by $\epsilon$-transitions alone.

b. For any state $q$ of $Q$, any string $x \in \Sigma^*$ and any symbol $a \in \Sigma$ (so that $a$ is the last symbol of $xa$),

$$\hat{\delta}(q, xa) = \epsilon\text{-closure}\Big(\bigcup_{p \in \hat{\delta}(q, x)} \delta(p, a)\Big)$$

This means that $\hat{\delta}(q, xa)$ is obtained by first finding the states that can be reached
from $q$ by reading $x$ (i.e. $\hat{\delta}(q, x)$), then finding, for each such state $p$, the states that can be
reached from $p$ by reading $a$ (i.e. $\delta(p, a)$), and finally taking the $\epsilon$-closure of the
union of these sets.
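
The two rules can be turned into a short Python sketch. The encoding is an assumption made for illustration: $\epsilon$ is represented by the empty string `""` as a dictionary key, the helper names `eclose` and `delta_hat` are hypothetical, and the transitions in `TABLE_4_28` are those of Table 4.28 (Example 4.6.3).

```python
EPS = ""   # assumed encoding of the epsilon label

# Table 4.28: (state, symbol) -> set of next states; missing entries are empty.
TABLE_4_28 = {
    ("q0", "a"): {"q1"},
    ("q0", EPS): {"q1", "q2"},
    ("q1", "b"): {"q2"},
    ("q1", EPS): {"q2"},
}


def eclose(delta, states):
    """Epsilon-closure of a set of states."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()) - closure:
            closure.add(r)
            stack.append(r)
    return closure


def delta_hat(delta, q, x):
    """Extended transition function of an NFA with epsilon-moves."""
    current = eclose(delta, {q})                 # rule (a)
    for a in x:                                  # rule (b), one symbol at a time
        moved = set()
        for p in current:
            moved |= delta.get((p, a), set())
        current = eclose(delta, moved)
    return current


print(sorted(delta_hat(TABLE_4_28, "q0", "")))
print(sorted(delta_hat(TABLE_4_28, "q0", "ab")))
```

Taking the closure after every symbol, and once at the start, guarantees that $\epsilon$-moves before, between and after the input symbols are all accounted for.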
EXAMPLE 4.6.6: To compute $\hat{\delta}(0, ab)$ for the NFA-$\epsilon$ shown in figure 4.70.

Figure 4.70. Transition Diagram of NFA-$\epsilon$


a. To compute $\epsilon$-closure:
■ $\epsilon$-closure({0}):

$\{0\} \subseteq \epsilon\text{-closure}(\{0\})$, [every state belongs to its own $\epsilon$-closure]
$\{0, 4\} \subseteq \epsilon\text{-closure}(\{0\})$, [$\because \delta(0, \epsilon) = \{4\}$]
$\{0, 4, 3\} \subseteq \epsilon\text{-closure}(\{0\})$, [$\because \delta(4, \epsilon) = \{3\}$ and further $\delta(3, \epsilon) = \emptyset$]

i.e., $\epsilon\text{-closure}(\{0\}) = \{0, 3, 4\}$.

Similarly,
■ $\epsilon\text{-closure}(\{1\}) = \{1, 2, 3\}$, ■ $\epsilon\text{-closure}(\{3\}) = \{3\}$,
■ $\epsilon\text{-closure}(\{5\}) = \{5\}$, ■ $\epsilon\text{-closure}(\{4\}) = \{3, 4\}$.

b. To compute $\hat{\delta}$:

\begin{align*}
\delta(\epsilon\text{-closure}(\{0\}), a) &= \delta(\{0, 3, 4\}, a) \\
&= \delta(0, a) \cup \delta(3, a) \cup \delta(4, a) \\
&= \{1\} \cup \{5\} \cup \{5\} \\
&= \{1, 5\}.
\end{align*}

\begin{align*}
\delta(\epsilon\text{-closure}(\delta(\epsilon\text{-closure}(\{0\}), a)), b) &= \delta(\epsilon\text{-closure}(\{1, 5\}), b) \\
&= \delta(\epsilon\text{-closure}(\{1\}) \cup \epsilon\text{-closure}(\{5\}), b) \\
&= \delta(\{1, 2, 3, 5\}, b) \\
&= \delta(1, b) \cup \delta(2, b) \cup \delta(3, b) \cup \delta(5, b) \\
&= \emptyset \cup \{4\} \cup \emptyset \cup \emptyset \\
&= \{4\}.
\end{align*}

Finally,

$$\hat{\delta}(0, ab) = \epsilon\text{-closure}(\{4\}) = \{3, 4\}.$$


4.6.9 Language Accepted by NFA-¢ 


A string $x$ is accepted by an NFA-$\epsilon$ $M$ with initial state $q_0 \in Q$ if and only if
$\hat{\delta}(q_0, x)$ contains at least one accepting state.
In other words, the language accepted by the NFA-$\epsilon$ is defined as

$$L(M) = \{x \mid \hat{\delta}(q_0, x) = P, \text{ where } P \text{ contains at least one member of } F\}$$

or

$$L(M) = \{x \mid \hat{\delta}(q_0, x) \cap F \neq \emptyset\}.$$

This means that an NFA-$\epsilon$ accepts the string $x$ iff it can reach an accepting state by reading
$x$, starting at the initial state.


EXAMPLE 4.6.7: Consider the automaton of figure 4.70 with the input string $w = ab$.

Here,
$$\hat{\delta}(0, ab) = \{3, 4\}.$$
Let $P = \{3, 4\}$. Since $P$ contains 3, which is the final state, $w = ab$ is accepted by the NFA-$\epsilon$.
In other words,

$$\hat{\delta}(0, ab) \cap F = \{3, 4\} \cap \{3\} = \{3\} \neq \emptyset \Rightarrow ab \text{ is accepted.}$$


Note-5:

a. '$\epsilon$' is a zero-length string, so it can occur anywhere in the input string: at the
front, at the back or between any two symbols.

b. A transition on $\epsilon$ means that the NFA-$\epsilon$ makes a transition without
reading any symbol of the input. This implies that the read head does not move
when $\epsilon$ is read.

c. Note that any NFA is also an NFA with $\epsilon$-moves.

4.7 Design of NFA-$\epsilon$s


The basic design strategy for an NFA-$\epsilon$ is as follows:


a. Understand the language properties for which the NFA-$\epsilon$ has to be designed.
b. Determine the alphabet and state set required.
c. Identify the initial, accepting and dead states of the NFA-$\epsilon$.
d. Identify all $\epsilon$-transitions.
e. Obtain the non-$\epsilon$-transitions for each state, on each character of the input string.
f. Draw the transition table and diagram for the NFA-$\epsilon$.
g. Compute the $\epsilon$-closure for each state.
h. Test the NFA-$\epsilon$ obtained on short strings.


EXAMPLE 4.7.1: Design an NFA-$\epsilon$ that accepts the string abac and all its suffixes.


Solution: We are required to design an NFA-$\epsilon$ for the regular expression $r = \epsilon + c + ac + bac +
abac$ (the empty string is also a suffix of abac). In other words, an NFA-$\epsilon$ $M$ should be designed to
accept the language of $r$,

i.e. $L(M) = \{\epsilon, c, ac, bac, abac\}$

where $M$ is the NFA-$\epsilon$.
Consider
$\Sigma = \{a, b, c\}$, $Q = \{0, 1, 2, 3, 4\}$, $q_0 = 0$, $F = \{4\}$
and $\delta$ is given by:


Transition diagram:


Figure 4.71. State Transition Diagram to represent
$L(M) = \{w \in \Sigma^* \mid w \text{ is abac or one of its suffixes}\}$

Transition table:

Present inputs
Q        a        b        c        ε
→0       {1}      ∅        ∅        {1, 2, 3, 4}
1        ∅        {2}      ∅        ∅
2        {3}      ∅        ∅        ∅
3        ∅        ∅        {4}      ∅
*4       ∅        ∅        ∅        ∅


Table 4.29 State Transition Table


NFA-ε action for the input string
To show that the string ac is accepted by the NFA-ε:


a. To compute ε-closure:

■ ε-closure({0}):
0 ∈ ε-closure({0}) by definition, and δ(0, ε) = {1, 2, 3, 4} adds the states 1, 2, 3 and 4.
Further, δ(1, ε) = δ(2, ε) = δ(3, ε) = δ(4, ε) = ∅. Hence the process terminates
and ε-closure({0}) = {0, 1, 2, 3, 4}.
■ ε-closure({1}) = {1}, since δ(1, ε) = ∅.
■ ε-closure({2}) = {2}, since δ(2, ε) = ∅.
■ ε-closure({3}) = {3}, since δ(3, ε) = ∅.
■ ε-closure({4}) = {4}, since δ(4, ε) = ∅.


b. To compute δ:

■ δ(ε-closure({0}), a) = δ({0, 1, 2, 3, 4}, a)
= δ(0, a) ∪ δ(1, a) ∪ δ(2, a) ∪ δ(3, a) ∪ δ(4, a)
= {1} ∪ ∅ ∪ {3} ∪ ∅ ∪ ∅
= {1, 3}.

■ δ(ε-closure({0}), ac) = δ(ε-closure(δ(ε-closure({0}), a)), c)
= δ(ε-closure({1, 3}), c)
= δ({1, 3}, c)
= δ(1, c) ∪ δ(3, c)
= ∅ ∪ {4}
= {4}.

■ ε-closure(δ(ε-closure({0}), ac)) = ε-closure({4})
= {4}.

Thus
δ(0, ac) ∩ F = {4} ∩ {4}
= {4} ≠ ∅ ⇒ ac is accepted by the NFA-ε.
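
The hand computation above can be cross-checked mechanically. The following is a minimal Python sketch, assuming the transition table of Example 4.7.1; the dictionary encoding, the use of the empty string "" to stand for an ε-move, and the names `eps_closure` and `accepts` are illustrative choices, not notation from the text.

```python
# Transition table of Example 4.7.1; "" stands for an ε-move (an encoding chosen here).
DELTA = {
    (0, "a"): {1}, (0, ""): {1, 2, 3, 4},
    (1, "b"): {2},
    (2, "a"): {3},
    (3, "c"): {4},
}
START, FINAL = 0, {4}


def eps_closure(states):
    """Return every state reachable from `states` using ε-moves only."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for t in DELTA.get((q, ""), set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def accepts(word):
    """Run the NFA-ε on `word`, tracking the set of active states."""
    current = eps_closure({START})
    for symbol in word:
        moved = set()
        for q in current:
            moved |= DELTA.get((q, symbol), set())
        current = eps_closure(moved)
    return bool(current & FINAL)


print(sorted(eps_closure({0})))   # [0, 1, 2, 3, 4]
print(accepts("ac"), accepts("ab"))  # True False
```

Missing table entries are treated as ∅, which is why `DELTA.get(..., set())` is used instead of listing every empty cell.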


EXAMPLE 4.7.2: Design an NFA with ε-moves to accept all the strings with any number
of a’s followed by any number of b’s followed by any number of c’s.


Solution: We are to design an NFA-ε for the regular expression r = a*b*c*, i.e., an NFA-ε, M,
is to be designed to accept the language of r,


i.e., L(M) = {ε, aa, bb, cc, abc, aabbcc, ...}.


Consider,
Σ = {a, b, c}, Q = {q0, q1, q2},
q0 = initial state, F = {q2}

and δ is given by:
Transition diagram:


Figure 4.72. State Transition Diagram to represent
L(M) = {w ∈ Σ* | w consists of any number of a’s, followed by any number of b’s,
followed by any number of c’s}.


Transition table:

 Q        a       b       c       ε
→q0      {q0}     ∅       ∅      {q1}
 q1       ∅      {q1}     ∅      {q2}
*q2       ∅       ∅      {q2}     ∅

Table 4.30 State Transition Table


NFA-ε action for the input string
To show that the string abc is accepted by the NFA-ε:


a. To compute ε-closure:
■ ε-closure({q0}) = {q0, q1, q2}
■ ε-closure({q1}) = {q1, q2}
■ ε-closure({q2}) = {q2}.


b. To compute δ:
■ δ(ε-closure({q0}), a) = δ({q0, q1, q2}, a)
= δ(q0, a) ∪ δ(q1, a) ∪ δ(q2, a)
= {q0}.

■ δ(ε-closure({q0}), ab) = δ(ε-closure(δ(ε-closure({q0}), a)), b)
= δ(ε-closure({q0}), b)
= δ({q0, q1, q2}, b)
= δ(q0, b) ∪ δ(q1, b) ∪ δ(q2, b)
= {q1}.

■ δ(ε-closure({q0}), abc) = δ(ε-closure(δ(ε-closure({q0}), ab)), c)
= δ(ε-closure({q1}), c)
= δ({q1, q2}, c)
= δ(q1, c) ∪ δ(q2, c)
= {q2}.
⇒ ε-closure(δ(ε-closure({q0}), abc)) = ε-closure({q2})
= {q2}.


Since q2 is an accepting state, the string abc is accepted.


EXAMPLE 4.7.3: Design an NFA-ε that accepts the set of all strings over Σ = {0, 1} which
start and end with 0.


Solution: We are required to design an NFA-ε for the R.E. r = 0(01)*0.
In other words, an NFA-ε, M, is designed to accept the language of r,


i.e., L(M) = {00, 010, 0110, 01010, ...}.
Consider
Q = {0, 1, 2, 3, 4, 5, 6, 7}, q0 = 0, F = {7}, Σ = {0, 1}
and δ is given by:


Transition diagram:


Figure 4.73. State Transition Diagram L(M) = {w ∈ Σ* | w starts and ends with 0}


Transition table:

 Q        0       1       ε
→0        ∅       ∅      {1, 2}
 1       {3}      ∅       ∅
 2       {4}      ∅       ∅
 3        ∅      {3}     {6}
 4        ∅      {5}      ∅
 5        ∅       ∅      {2, 6}
 6       {7}      ∅       ∅
*7        ∅       ∅      {6}


Table 4.31 State Transition Table


NFA-ε action for the input string
To show that 010 is accepted by the NFA-ε:


a. Computing ε-closure:

■ ε-closure({0}) = {0, 1, 2}
■ ε-closure({1}) = {1}
■ ε-closure({2}) = {2}
■ ε-closure({3}) = {3, 6}
■ ε-closure({4}) = {4}
■ ε-closure({5}) = {5, 2, 6}
■ ε-closure({6}) = {6}
■ ε-closure({7}) = {7, 6}, since δ(7, ε) = {6}.


b. To compute δ:

■ δ(ε-closure({0}), 0) = δ({0, 1, 2}, 0)
= δ(0, 0) ∪ δ(1, 0) ∪ δ(2, 0)
= {3, 4}.

■ δ(ε-closure({0}), 01) = δ(ε-closure(δ(ε-closure({0}), 0)), 1)
= δ(ε-closure({3, 4}), 1)
= δ({3, 6, 4}, 1)
= δ(3, 1) ∪ δ(6, 1) ∪ δ(4, 1)
= {3, 5}.

■ δ(ε-closure({0}), 010) = δ(ε-closure(δ(ε-closure({0}), 01)), 0)
= δ(ε-closure({3, 5}), 0)
= δ({3, 6, 2, 5}, 0)
= δ(3, 0) ∪ δ(6, 0) ∪ δ(2, 0) ∪ δ(5, 0)
= {7, 4}.

Further,
ε-closure(δ(ε-closure({0}), 010)) = ε-closure({7, 4})
= ε-closure({7}) ∪ ε-closure({4})
= {7, 6, 4}.
Since P = {7, 6, 4} contains the final state 7, the string 010 is accepted by the NFA-ε.


EXAMPLE 4.7.4: Design an NFA-ε to accept the set of all strings over Σ = {0, 1} with any
number of 0’s or any number of 1’s.


Solution: We are to design an NFA-ε for the regular expression r = (0)* + (1)*. In other
words, an NFA-ε should be designed to accept the language of r,


i.e., L(M) = {ε, 0, 1, 01, 0011, ...}.


Consider,
Σ = {0, 1}, Q = {q0, q1, q2}, q0 = initial state, F = {q2}


and δ is given by:


Transition diagram:


Figure 4.74. State Transition Diagram to represent
L(M) = {w ∈ Σ* | w contains 0 or 1 or both}.


Transition table:

 Q        0       1       ε
→q0      {q0}     ∅      {q1}
 q1       ∅      {q1}    {q2}
*q2      {q2}     ∅       ∅

Table 4.32 State Transition Table
NFA-ε action for the input string
To show that the string 01 is accepted by the NFA-ε:


a. Computing ε-closure:

■ ε-closure({q0}) = {q0, q1, q2}
■ ε-closure({q1}) = {q1, q2}
■ ε-closure({q2}) = {q2}.


b. To compute δ:

■ δ(ε-closure({q0}), 0) = δ({q0, q1, q2}, 0)
= δ(q0, 0) ∪ δ(q1, 0) ∪ δ(q2, 0)
= {q0} ∪ ∅ ∪ {q2}
= {q0, q2}.
■ δ(ε-closure({q0}), 01) = δ(ε-closure(δ(ε-closure({q0}), 0)), 1)
= δ(ε-closure({q0, q2}), 1)
= δ({q0, q1, q2}, 1)
= δ(q0, 1) ∪ δ(q1, 1) ∪ δ(q2, 1)
= ∅ ∪ {q1} ∪ ∅
= {q1}.
Further,
ε-closure(δ(ε-closure({q0}), 01)) = ε-closure({q1})
= {q1, q2}.
Let P = {q1, q2}. Since P contains q2, which is a final state, 01 is accepted by the
NFA-ε.
EXAMPLE 4.7.5: Design an NFA with ε-moves to accept the set of all strings over
Σ = {a, b} that consist of one a followed by any number of a’s, or one b followed by
any number of b’s.


Solution: We are to design an NFA-ε for the R.E. r = aa* + bb*. In other words, an
NFA-ε, M, should be designed to accept the language of r,


i.e., L(M) = {a, b, aa, bb, ...}.
Consider,


Σ = {a, b}, Q = {q0, q1, q2, q3, q4}, q0 = initial state, F = {q2, q4}
and δ is given by:
Transition diagram:


Figure 4.75. State Transition Diagram to represent
L(M) = {w ∈ Σ* | w is one or more a’s, or one or more b’s}.


Transition table:

 Q        a       b       ε
→q0       ∅       ∅      {q1, q3}
 q1      {q2}     ∅       ∅
*q2      {q2}     ∅       ∅
 q3       ∅      {q4}     ∅
*q4       ∅      {q4}     ∅

Table 4.33 State Transition Table


NFA-ε action for the input string
To show that aa is accepted by the NFA-ε:


a. Computing ε-closure:

■ ε-closure({q0}) = {q0, q1, q3}
■ ε-closure({q1}) = {q1}
■ ε-closure({q3}) = {q3}
■ ε-closure({q2}) = {q2}
■ ε-closure({q4}) = {q4}

b. Computing δ:
■ δ(ε-closure({q0}), a) = δ({q0, q1, q3}, a)
= δ(q0, a) ∪ δ(q1, a) ∪ δ(q3, a)
= ∅ ∪ {q2} ∪ ∅
= {q2}.
■ δ(ε-closure({q0}), aa) = δ(ε-closure(δ(ε-closure({q0}), a)), a)
= δ(ε-closure({q2}), a)
= δ({q2}, a)
= {q2}.
Further,
ε-closure(δ(ε-closure({q0}), aa)) = ε-closure({q2})
= {q2}.
Let P = {q2}. Since P contains q2, which is a final state, the string aa is accepted by the
NFA-ε.


EXAMPLE 4.7.6: Design an NFA-ε that accepts decimal numbers.


Solution: Any decimal number can be considered to have the following form:


a. The sign is optional.
b. A string of digits (can be empty) must be present.
c. It must have a decimal point.
d. Another string of digits can be present (can also be empty).


Thus, the required NFA-ε can be designed as follows:


a. The initial state is q0.
b. The state q1 represents the sign (if any), i.e., there is a transition from q0 to q1 on
any of ε, + or − (but no digit or decimal point).
c. The state q2 represents the situation where we have just seen the decimal point and
may or may not have seen digits prior to it.
d. The state q3 represents the situation where we have seen a decimal point and at least one
digit, either before or after the decimal point. We may stay at q3 to read the string of
digits and, after the complete string has been scanned, the automaton may reach the final state q4.
e. The state q5 represents the situation where we have seen at least one digit but not the
decimal point (or any other characters).


In other words,


Σ: contains the digits 0-9, the signs (+, −) and the decimal point (.)
Q: {q0, q1, q2, q3, q4, q5}
q0: initial state, and
q4: final state.

Thus, the above NFA-ε can be constructed as:


Transition diagram:


Figure 4.76. State Transition Diagram NFA-ε


NFA-ε action for the input string
To show that x = −1.2 is accepted by the NFA-ε:


a. To compute ε-closure:

ε-closure({q0}) = {q0, q1}, ε-closure({q1}) = {q1, q2}
ε-closure({q2}) = {q2}, ε-closure({q3}) = {q3, q4}


b. To compute δ:

■ δ(ε-closure({q0}), −) = δ({q0, q1}, −)
= δ(q0, −) ∪ δ(q1, −)
= {q1}.

■ δ(ε-closure({q0}), −1) = δ(ε-closure(δ(ε-closure({q0}), −)), 1)
= δ(ε-closure({q1}), 1)
= δ({q1, q2}, 1)
= δ(q1, 1) ∪ δ(q2, 1)
= {q1}.
■ δ(ε-closure({q0}), −1.) = δ(ε-closure(δ(ε-closure({q0}), −1)), .)
= δ(ε-closure({q1}), .)
= δ({q1, q2}, .)
= δ(q1, .) ∪ δ(q2, .)
= {q3}.
■ δ(ε-closure({q0}), −1.2) = δ(ε-closure(δ(ε-closure({q0}), −1.)), 2)
= δ(ε-closure({q3}), 2)
= δ({q3, q4}, 2)
= δ(q3, 2) ∪ δ(q4, 2)
= {q3}.
Further,
ε-closure(δ(ε-closure({q0}), −1.2)) = ε-closure({q3})
= {q3, q4}.
Let P = {q3, q4}. Since P contains the final state q4, x = −1.2 is accepted.


EXAMPLE 4.7.7: Construct an NFA-ε to accept either a or abb or a*b* over the alphabet
Σ = {a, b}.


Solution: Let M = (Q, Σ, q0, δ, F) be the NFA-ε such that Q = {0, 1, 2, 3, 4, 5,
6, 7, 8}, Σ = {a, b}, q0 = 0 and F = {2, 6, 8}. The transition function δ for the NFA-ε is shown in
figure 4.77.


Figure 4.77. Transition Diagram for NFA-ε


EXAMPLE 4.7.8: Construct an NFA-ε to accept ‘if’ (with reference to C-programming).
Solution: Let M = (Q, Σ, q0, δ, F) be the NFA-ε such that Q = {q0, q1, q2, q3, q4, q5,
q6}, Σ is the set of letters and special characters, q0 is the initial state and F = {q6}. The transition function
δ for the NFA-ε is shown in figure 4.78.


Figure 4.78. State Transition Diagram for NFA-ε
EXAMPLE 4.7.9: Construct an NFA-ε to accept the set of all strings over Σ = {0, 1}
containing an even number of 0’s or exactly two 1’s.


Solution: Let M = (Q, Σ, q0, δ, F) be the NFA-ε such that Q = {q0, q1, q2, q3, q4, q5}, Σ =
{0, 1}, q0 is the initial state and F = {q1, q5}. The transition function δ of M to accept strings containing an even
number of 0’s or exactly two 1’s is shown in figure 4.79.


Figure 4.79. State Transition Diagram NFA-ε


EXAMPLE 4.7.10: Construct an NFA-ε to accept the set of all strings over Σ = {0, 1}
containing a string that begins with ‘1’ and ends with a 0, or a string containing at least three
1’s.
Solution: Let M = (Q, Σ, q0, δ, F) be the NFA-ε such that Q = {q0, q1, q2, q3, q4, q5,
q6, q7, q8}, Σ = {0, 1}, q0 is the initial state and F = {q3, q8}. The transition function δ of M is shown in
figure 4.80.


Figure 4.80. State Transition Diagram NFA-ε


4.8 Advantages of Non-Deterministic Finite Automata 


The advantages of nondeterminism are:
a. An NFA can be smaller, and easier to construct and understand, than a DFA that accepts the
same language.
b. It is useful for proving some theorems.
c. It gives a good introduction to nondeterminism in more powerful computational models,
where nondeterminism plays an important role.
d. Nondeterminism is a kind of parallel computation where several processes can be
running concurrently.


When an NFA splits to follow several choices, this corresponds to a process ‘forking’ into
several children, each proceeding separately; the entire computation accepts if at least
one of these accepts.
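
The ‘forking’ picture can be illustrated by keeping, after each input symbol, the set of all processes that are still alive. The sketch below is only an illustration: the three-state NFA (accepting strings over {0, 1} that end in 01) and the names `trace` and `accepts` are assumptions made for this example and do not come from the text.

```python
# Hypothetical NFA (not from the text): accepts strings over {0, 1} ending in 01.
DELTA = {
    ("p", "0"): {"p", "q"},   # on 0: stay in p, or fork a child that guesses "01 starts here"
    ("p", "1"): {"p"},
    ("q", "1"): {"r"},
}
START, FINAL = "p", {"r"}


def trace(word):
    """Return the list of active-state sets, one per prefix of `word`."""
    active = {START}
    history = [set(active)]
    for symbol in word:
        # Each active state is one "process": it forks into all of its
        # successors and dies if it has none.
        active = set().union(*(DELTA.get((q, symbol), set()) for q in active))
        history.append(set(active))
    return history


def accepts(word):
    """The computation accepts if at least one surviving process is in a final state."""
    return bool(trace(word)[-1] & FINAL)


for step, states in enumerate(trace("001")):
    print(step, sorted(states))
```

On the input 001 the active sets are {p}, {p, q}, {p, q} and {p, r}; the child that guessed too early dies on the second 0, while a later child reaches r.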
4.9 NFA versus DFA


a. A DFA is a special case of an NFA.


■ In a DFA, every state q and every symbol ‘a’ has a unique a-transition, i.e.,
there is a unique q′ such that q --a--> q′. This is not necessarily so in an NFA. At any
state, an NFA may have multiple a-transitions or none.


■ In a DFA, transition arrows are labelled by symbols from Σ, while in an NFA
they are labelled by symbols from Σ ∪ {ε}, i.e., an NFA may have
ε-transitions.


Thus, in general, a DFA is a special case of an NFA.


b. NFAs are very convenient for showing that a language is regular. Often it is
much easier to provide an NFA for a language than a DFA.


c. Space and time taken to recognise regular expressions:

The determinism of a DFA is important when implementing a program (or physical
machine) for recognising a regular language. NFAs are not as well suited to language
recognition because nondeterminism gives the machine the possibility of making
incorrect choices. In general, when an NFA is used for language recognition, it is
necessary to backtrack and explore all possible choices, which is a time-consuming
activity.

Thus, NFAs are more compact (in space), but take time to backtrack over all
choices. On the other hand, a DFA takes more space, but saves time (speeds up
computation).


d. Transition function:


For a DFA, δ is defined as δ : Q × Σ → Q.
For an NFA, δ is defined as δ : Q × (Σ ∪ {ε}) → P(Q).


e. NFA computation can be represented as a tree structure, as shown in figure
4.81.


In figure 4.81, the root of the tree corresponds to the starting state of the computation
and every branching point in the tree corresponds to a point in the process of computation
at which the NFA has multiple choices. The NFA accepts a string only when at least one of
the computation branches ends in an accept state.
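
To make the tree view concrete, the following Python sketch enumerates every root-to-leaf branch of an NFA computation. The NFA used (strings over {0, 1} ending in 01) and the names `branches` and `accepts` are hypothetical, chosen only for illustration; ε-moves are left out to keep the sketch short.

```python
# Hypothetical NFA (not from the text): accepts strings over {0, 1} ending in 01.
DELTA = {("p", "0"): ["p", "q"], ("p", "1"): ["p"], ("q", "1"): ["r"]}
START, FINAL = "p", {"r"}


def branches(state, word):
    """Yield every root-to-leaf path of the computation tree as (path, verdict)."""
    if not word:
        yield [state], ("accept" if state in FINAL else "reject")
        return
    successors = DELTA.get((state, word[0]), [])
    if not successors:
        # No move on this symbol: the branch dies before the input is consumed.
        yield [state], "reject"
        return
    for nxt in successors:
        for path, verdict in branches(nxt, word[1:]):
            yield [state] + path, verdict


def accepts(word):
    """The NFA accepts if at least one branch ends in an accept state."""
    return any(verdict == "accept" for _, verdict in branches(START, word))


for path, verdict in branches(START, "001"):
    print(" -> ".join(path), verdict)
```

Exploring branches one by one like this is the backtracking strategy mentioned in item c; a DFA for the same language would follow a single path.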
Figure 4.81. (a) Unidirectional DFA Computation. (b) Multidirectional NFA
Computation


4.10 Exercises 


4.10.1 Deterministic Finite Automata 


1. What is a DFA? Present the formal definition of a DFA.
2. Write out the five-tuple for the DFA shown in figure 4.82.


Figure 4.82. State Transition Diagram


3. Trace the DFA shown in figure 4.83 on the following inputs:
a. 010101 b. 0000 c. 111
Which of these inputs are accepted and which are rejected?


Figure 4.83. State Transition Diagram


4. Construct a DFA accepting each of the following languages:
a. {w ∈ {a, b}* : w has neither aa nor bb as a substring}
b. {w ∈ {a, b}* : w has both ab and ba as substrings}


5. Draw a state diagram for a DFA over {a, b} that, on input w, produces output a^n, where
n is the number of occurrences of the substring ab in w.


6. Obtain a DFA to accept the language L = {w : |w| mod 8 ≠ 0} on Σ = {a, b}.


7. Obtain a DFA to accept strings of a’s and b’s such that each block of 5 consecutive
symbols has at least two a’s.


4.10.2 Nondeterministic Finite Automata 


1. What is an NFA? Present the formal definition of an NFA.
2. Differentiate between NFA and DFA.
3. Write out the five-tuple for the NFA shown in figure 4.84.


Figure 4.84. State Transition Diagram


4. Trace the NFA shown in figure 4.85 on the following inputs:
a. ab b. abab c. aba d. abaa
Which of these inputs are accepted and which are rejected?


Figure 4.85. State Transition Diagram


5. Construct an NFA to accept strings of a’s and b’s such that each block of five
consecutive symbols has at least two a’s.


6. Obtain an NFA to recognise the strings abc, abd and aacd on Σ = {a, b, c, d}.
7. Find a simple NFA accepting (ab + aab + aba)*.
8. Obtain an NFA to accept the following language:


L = {w | w ∈ abab^n or aba^n, where n > 0}.


4.10.3 Nondeterministic Finite Automata with ε-moves
1. What is an NFA-ε? Present the formal definition of an NFA-ε.
2. Differentiate between NFA and NFA-ε.
3. Write out the five-tuple for the NFA-ε shown in figure 4.86.


Figure 4.86. State Transition Diagram
4. What is ε-closure? Find the ε-closure for the NFA-ε given in figures 4.87(a) and (b).


Figure 4.87.(a) State Transition Diagram


Figure 4.87.(b) State Transition Diagram


5. Design an NFA-ε to accept strings of a’s and b’s such that it accepts either
a string consisting of one a followed by any number of a’s, or one b followed by any
number of b’s.

6. Design an NFA-ε for the regular expression a* + b* + c*.


7. What is the language accepted by the NFA-ε given in figure 4.88?


Figure 4.88. State Transition Diagram
‘Transitions enhance changes, equivalence suppresses changes.’


Introduction 


The key idea of this chapter is to introduce and discuss certain new concepts based on 
the study made in the previous chapters. The concept of equivalence of two computational 
models, associated theorems and examples are discussed in detail. 


5.1 Equivalent Finite-State Automata 


The set of words accepted by an FSA, M, is the language accepted (or recognised) by M and
is denoted by L(M).


5.1.1 Definition


Two finite automata M1 and M2 (possibly of different types) are equivalent if and only if
L(M1) = L(M2),
i.e., M1 and M2 accept or recognise the same language.
The equivalences of finite automata are:
a. equivalence of NFA and DFA, or equivalence of NFA-ε and DFA,
b. equivalence of NFA-ε and NFA.
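
As a rough illustration of the definition, two automata can be compared on every string up to a chosen length. The sketch below assumes a hypothetical NFA and DFA (both intended to recognise strings over {0, 1} ending in 01); the names `nfa_accepts`, `dfa_accepts` and `agree_up_to` are illustrative. Agreement on short strings is only evidence of equivalence, not a proof.

```python
from itertools import product

# Hypothetical pair (not from the text): both recognise strings over {0, 1} ending in 01.
NFA = {("p", "0"): {"p", "q"}, ("p", "1"): {"p"}, ("q", "1"): {"r"}}
DFA = {("A", "0"): "B", ("A", "1"): "A",
       ("B", "0"): "B", ("B", "1"): "C",
       ("C", "0"): "B", ("C", "1"): "A"}


def nfa_accepts(word):
    """Simulate the NFA by tracking its set of active states (start p, final r)."""
    active = {"p"}
    for symbol in word:
        active = set().union(*(NFA.get((q, symbol), set()) for q in active))
    return "r" in active


def dfa_accepts(word):
    """Simulate the DFA along its single path (start A, final C)."""
    state = "A"
    for symbol in word:
        state = DFA[(state, symbol)]
    return state == "C"


def agree_up_to(max_len, alphabet="01"):
    """True if both machines give the same verdict on every string of length <= max_len."""
    return all(
        nfa_accepts(w) == dfa_accepts(w)
        for n in range(max_len + 1)
        for w in map("".join, product(alphabet, repeat=n))
    )


print(agree_up_to(8))
```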


5.2 Equivalence of NFA/NFA-ε and DFA


5.2.1 Definition 


Two finite automata N and D are said to be equivalent if
L(N) = L(D)
where D represents a deterministic finite automaton and N represents a nondeterministic finite
automaton.


That is, N and D accept the same language. This means,


■ any language accepted by D can also be accepted by some N, and
■ any language accepted by N can also be accepted by some D.


Thus, for any language described by some N, there is a D, i.e., for every N, it is possible to
construct an equivalent D that accepts the same language. In the worst case, however, the
smallest D can have 2^n states while the smallest N for the same language has only n states.


Theorem 1


A language L is accepted by some NFA if and only if it is accepted by some DFA.
Alternatively, for every NFA, there exists a DFA that accepts the same language.


Solution:
This theorem has two parts to prove:
a. if L is accepted by a DFA, D, then L is accepted by some NFA, N;
b. if L is accepted by an NFA, N, then L is accepted by some DFA, D.
Proof


Part a: If L is accepted by D, then L is accepted by some N.
In order to prove this, let us compare the definitions of N and D.


Definition ‘D’: A DFA, D, is defined by the 5-tuple
D = (Q′, Σ, δ′, q0′, F′)
where Q′ : finite set of states
Σ : finite set of symbols, the input alphabet
δ′ : transition function, δ′ : Q′ × Σ → Q′
q0′ : initial state
F′ : set of final states, F′ ⊆ Q′.


Definition ‘N’: An NFA, N, is defined by the 5-tuple
N = (Q, Σ, δ, q0, F)
where Q : finite set of states
Σ : finite set of symbols, the input alphabet
δ : transition function, δ : Q × Σ → 2^Q
q0 : initial state
F : set of final states, F ⊆ Q.


From the above definitions, it follows that every DFA is also an NFA (one in which every
transition set contains exactly one state), which implies that
if w ∈ L(D), then w ∈ L(N).


Part b: If L is accepted by N, then L is accepted by some D. In order to prove this,
let N and D be the NFA and DFA such that the states of D are Q′ = 2^Q, i.e.,
all the states of D are subsets of the set of states of N.


The initial state of D is the set containing the initial state of N, i.e., q0′ = {q0}. A final state of D is
any state of D that contains a final state of N. The next states of D are determined from
the initial state, followed by successive states.

Here, δ′ is defined as follows:


δ′({q1, q2, ..., qi}, a) = {p1, p2, ..., pj}


if and only if


δ({q1, q2, ..., qi}, a) = {p1, p2, ..., pj}.


That is, on applying δ to each of q1, ..., qi and taking the union, we get the new set of states p1, ..., pj.
Now we prove that, for any input string x,


δ′(q0′, x) = {q1, q2, ..., qi} iff δ(q0, x) = {q1, q2, ..., qi}.


We prove this by induction on the length of x.
Base case: The result is true for |x| = 0, i.e., x = ε, because
δ′(q0′, ε) = {q0} and δ(q0, ε) = {q0}.

Induction step: Let us assume that the result is true for each string of length n. Now, we shall show that
the result is true for any string of length (n + 1).

Let w = xa, with |w| = n + 1, |x| = n and a ∈ Σ. Thus, by the induction hypothesis,


δ′(q0′, x) = {p1, p2, ..., pj}
iff δ(q0, x) = {p1, p2, ..., pj}     (1)
where {p1, ..., pj} are states of N.
By the definition of δ′,


δ′({p1, ..., pj}, a) = {r1, ..., rk}
iff δ({p1, ..., pj}, a) = {r1, ..., rk}.     (2)


Therefore,

δ′(q0′, xa) = δ′(δ′(q0′, x), a)
= δ′({p1, ..., pj}, a) [using (1)]
= {r1, r2, ..., rk} [using (2)]
= δ(q0, xa).


Hence, the result is true for |w| = n + 1 whenever it is true for strings of length n.
Now, δ′(q0′, x) ∈ F′ exactly when δ(q0, x) contains a state of F.


⇒ L(N) = L(D).


Thus, for every NFA there exists an equivalent DFA, which accepts the same language.


5.2.2 Conversion Algorithm for NFA/NFA-ε to DFA


Let the DFA be D and the NFA be N. The algorithm is called the Powerset or Subset Construction
because, for any NFA, N, with n states, the corresponding DFA, D, can have up to 2^n states, one
for each subset of the states of N. The algorithm builds the DFA from only those subsets that
can be reached from the start state by some sequence of transitions; unreachable subsets are
discarded. The algorithm is as follows:


Subset Construction Algorithm


a. Construct the start state q0′ of D from {q0}, consisting of q0 and all the states of the NFA that
can be reached from q0 by one or several ε-transitions.
Mark q0′ as ‘unfinished’.


b. While there are ‘unfinished’ states,


■ take an ‘unfinished’ state s
■ for each a ∈ Σ, let δ′(s, a) = ∪ {t | q --a--> t}, the union being taken over all q ∈ s. If δ′(s, a) is neither ‘finished’ nor
‘unfinished’ yet, then mark δ′(s, a) as ‘unfinished’.
■ mark s as ‘finished’.


c. Mark all states that contain a final state of N as final states of D.
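
The steps above can be sketched in Python as a worklist loop. This is a minimal sketch for an NFA without ε-moves, using the NFA of Example 5.2.1 (Table 5.1) as data; the accepting set {q1} is an assumption inferred from the F′ listed in that example, and the names `subset_construction`, `table` and `unfinished` are illustrative.

```python
from collections import deque

# NFA of Example 5.2.1 (Table 5.1). The accepting set {q1} is an assumption
# inferred from the F' listed in that example.
ALPHABET = ("0", "1")
DELTA = {
    ("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q1"},
    ("q1", "1"): {"q0", "q1"},
}
START, FINAL = "q0", {"q1"}


def subset_construction():
    """Build only the DFA states reachable from the start subset."""
    start = frozenset({START})
    table = {}                      # finished rows: subset -> {symbol: subset}
    unfinished = deque([start])
    while unfinished:
        s = unfinished.popleft()
        if s in table:              # already finished via an earlier queue entry
            continue
        row = {}
        for a in ALPHABET:
            # Union of the NFA moves of every state in s on symbol a.
            target = frozenset(t for q in s for t in DELTA.get((q, a), set()))
            row[a] = target
            if target not in table:
                unfinished.append(target)
        table[s] = row              # s is now "finished"
    finals = {s for s in table if s & FINAL}
    return start, table, finals


start, table, finals = subset_construction()
for s, row in table.items():
    print(sorted(s), {a: sorted(t) for a, t in row.items()})
```

Subsets are stored as `frozenset` values so that they can serve as dictionary keys; the empty frozenset plays the role of the dead state ∅.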
Note-1:


■ In the transition table, unfinished states are those whose rows are incomplete
and finished states are those whose rows are complete.


■ For an NFA with n states, the DFA can have 2^n states. But it is not necessary
to construct δ′ for all these 2^n states. It needs to be constructed only for those
states which are reachable from the initial state. This is because the interest
is only in constructing the equivalent DFA.
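
How many subsets are actually reachable depends on the NFA. The sketch below counts them for a classical family that is not taken from the text: a hypothetical NFA with n + 1 states accepting the strings over {0, 1} whose n-th symbol from the end is 1. For this family the number of reachable subsets doubles with each increase of n, so the exponential growth mentioned in the note can really occur. The function name `reachable_subsets` is illustrative.

```python
def reachable_subsets(n):
    """Count the DFA states produced by the subset construction for a hypothetical
    NFA with n + 1 states accepting strings whose n-th symbol from the end is 1."""
    def delta(q, a):
        if q == 0:
            # State 0 loops on every symbol and, on a 1, may guess that this
            # 1 is the n-th symbol from the end.
            return {0, 1} if a == "1" else {0}
        # States 1..n-1 just count one more symbol; state n has no moves.
        return {q + 1} if q < n else set()

    start = frozenset({0})
    seen, todo = {start}, [start]
    while todo:
        s = todo.pop()
        for a in "01":
            target = frozenset(t for q in s for t in delta(q, a))
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return len(seen)


for n in (1, 2, 3, 4):
    print(n + 1, "NFA states ->", reachable_subsets(n), "reachable DFA states")
```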


EXAMPLE 5.2.1: Construct a DFA equivalent to the NFA given in figure 5.1.


Figure 5.1. State Diagram


Solution:
The transition table of the above NFA is:

 Q        0           1
→q0      {q0, q1}    {q1}
 q1       ∅          {q0, q1}

Table 5.1 State Transition Table

No. of states in NFA = 2.
Possible states in DFA = 2^2 = 4.

Consider q0′ = {q0}, which is marked unfinished.

■ Process {q0}
δ(q0, 0) = {q0, q1}, δ(q0, 1) = {q1}.
{q0} is now finished; {q0, q1} and {q1} are unfinished.

■ Process {q0, q1}
δ({q0, q1}, 0) = δ(q0, 0) ∪ δ(q1, 0)
= {q0, q1} ∪ ∅ = {q0, q1}
δ({q0, q1}, 1) = δ(q0, 1) ∪ δ(q1, 1)
= {q1} ∪ {q0, q1} = {q0, q1}.
{q0, q1} is now finished; no new state appears.

■ Process {q1}
δ({q1}, 0) = ∅, δ({q1}, 1) = {q0, q1}.
{q1} is now finished; ∅ is unfinished.

■ Process ∅
δ(∅, 0) = δ(∅, 1) = ∅.

The completed transition table of the DFA is:

 δ′           0           1
→{q0}        {q0, q1}    {q1}
*{q0, q1}    {q0, q1}    {q0, q1}
*{q1}         ∅          {q0, q1}
 ∅            ∅           ∅

Thus, the equivalent DFA is D = (Q′, Σ, δ′, q0′, F′), where Q′ = {{q0}, {q0, q1}, {q1}, ∅},
F′ = {{q0, q1}, {q1}} and δ′ is shown in figure 5.2.


Figure 5.2. Transition Diagram for the DFA
EXAMPLE 5.2.2: Construct a DFA equivalent to the NFA given in figure 5.3.


Figure 5.3. State Transition Diagram.


Solution:
The transition table of the NFA is given below.

 Q        0       1
→q0      {q0}    {q1}
 q1      {q1}    {q0, q1}

Table 5.2 State Transition Table NFA

No. of states in NFA = 2.
Possible states in DFA = 2^2 = 4.

Consider q0′ = {q0}, which is marked unfinished.

■ Process {q0}
δ(q0, 0) = {q0}, δ(q0, 1) = {q1}.
{q0} is now finished; {q1} is unfinished.

■ Process {q1}
δ(q1, 0) = {q1}, δ(q1, 1) = {q0, q1}.
{q1} is now finished; {q0, q1} is unfinished.

■ Process {q0, q1}
δ({q0, q1}, 0) = δ(q0, 0) ∪ δ(q1, 0)
= {q0} ∪ {q1} = {q0, q1}
δ({q0, q1}, 1) = δ(q0, 1) ∪ δ(q1, 1)
= {q1} ∪ {q0, q1} = {q0, q1}.

The completed transition table of the DFA is:

 δ′           0           1
→{q0}        {q0}        {q1}
*{q1}        {q1}        {q0, q1}
*{q0, q1}    {q0, q1}    {q0, q1}

Thus, the equivalent DFA is D = (Q′, Σ, δ′, q0′, F′), where Q′ = {{q0}, {q1}, {q0, q1}},
F′ = {{q1}, {q0, q1}} and δ′ is shown in figure 5.4.


Figure 5.4. State Transition Diagram for the DFA.


EXAMPLE 5.2.3: Construct a DFA equivalent to the NFA given in figure 5.5.


Figure 5.5. State Transition Diagram


Solution:
The transition table of the above NFA is given below:

 Q        a           b
→q0      {q0, q1}    {q2}
 q1      {q0}        {q1}
 q2       ∅          {q0, q1}

Table 5.3. State Transition Table NFA

No. of NFA states = 3
Possible DFA states = 2^3 = 8

Consider q0′ = {q0}, which is marked unfinished.

■ Process {q0}
δ(q0, a) = {q0, q1}, δ(q0, b) = {q2}.
{q0} is now finished; {q0, q1} and {q2} are unfinished.

■ Process {q0, q1}
δ({q0, q1}, a) = δ(q0, a) ∪ δ(q1, a)
= {q0, q1} ∪ {q0} = {q0, q1}
δ({q0, q1}, b) = δ(q0, b) ∪ δ(q1, b)
= {q2} ∪ {q1} = {q1, q2}.
{q0, q1} is now finished; {q1, q2} is unfinished.

■ Process {q2}
δ(q2, a) = ∅, δ(q2, b) = {q0, q1}.
{q2} is now finished; ∅ is unfinished.

■ Process {q1, q2}
δ({q1, q2}, a) = δ(q1, a) ∪ δ(q2, a)
= {q0} ∪ ∅ = {q0}
δ({q1, q2}, b) = δ(q1, b) ∪ δ(q2, b)
= {q1} ∪ {q0, q1} = {q0, q1}.
Similarly, δ(∅, a) = δ(∅, b) = ∅.

The completed transition table of the DFA is:

 δ′           a           b
→{q0}        {q0, q1}    {q2}
 {q0, q1}    {q0, q1}    {q1, q2}
*{q2}         ∅          {q0, q1}
*{q1, q2}    {q0}        {q0, q1}
 ∅            ∅           ∅

The equivalent DFA is D = (Q′, Σ, δ′, q0′, F′), where Q′ = {{q0}, {q0, q1}, {q2}, {q1, q2}, ∅},
F′ = {{q2}, {q1, q2}} and δ′ is shown in figure 5.6.


Figure 5.6. State Transition Diagram for the DFA
EXAMPLE 5.2.4: Construct a DFA equivalent to the NFA given in figure 5.7.


Figure 5.7. State Transition Diagram


Solution:
The transition table of the above NFA is given below.

 Q        0       1
→q0      {q0}    {q1}
 q1       ∅      {q1, q0}

Table 5.4 State Transition Table NFA

No. of NFA states = 2
Possible DFA states = 2^2 = 4

Consider q0′ = {q0}, which is marked unfinished.

■ Process {q0}
δ(q0, 0) = {q0}, δ(q0, 1) = {q1}.
{q0} is now finished; {q1} is unfinished.

■ Process {q1}
δ(q1, 0) = ∅, δ(q1, 1) = {q1, q0}.
{q1} is now finished; {q1, q0} and ∅ are unfinished.

■ Process {q1, q0}
δ({q0, q1}, 0) = δ(q0, 0) ∪ δ(q1, 0)
= {q0} ∪ ∅ = {q0}
δ({q0, q1}, 1) = δ(q0, 1) ∪ δ(q1, 1)
= {q1} ∪ {q1, q0} = {q1, q0}.
Similarly, δ(∅, 0) = δ(∅, 1) = ∅.

The completed transition table of the DFA is:

 δ′           0       1
→*{q0}       {q0}    {q1}
  {q1}        ∅      {q1, q0}
 *{q1, q0}   {q0}    {q1, q0}
  ∅           ∅       ∅

Thus, the equivalent DFA is D = (Q′, Σ, δ′, q0′, F′), where Q′ = {{q0}, {q1}, {q1, q0}, ∅},
F′ = {{q0}, {q1, q0}} and δ′ is shown in figure 5.8.


Figure 5.8. State Transition Diagram for the DFA


EXAMPLE 5.2.5: Construct a DFA equivalent to the NFA given in figure 5.9.


Figure 5.9. State Transition Diagram


Solution:
The transition table of the NFA is given below.

 Q        0           1
→q0      {q0, q1}    {q1}
 q1      {q2}        {q2}
 q2       ∅          {q2}

Table 5.5 State Transition Table

No. of NFA states = 3
Possible DFA states = 2^3 = 8

Consider q0′ = {q0}, which is marked unfinished.

■ Process {q0}
δ(q0, 0) = {q0, q1}, δ(q0, 1) = {q1}.
{q0} is now finished; {q0, q1} and {q1} are unfinished.

■ Process {q0, q1}
δ({q0, q1}, 0) = δ(q0, 0) ∪ δ(q1, 0)
= {q0, q1} ∪ {q2} = {q0, q1, q2}
δ({q0, q1}, 1) = δ(q0, 1) ∪ δ(q1, 1)
= {q1} ∪ {q2} = {q1, q2}.
{q0, q1} is now finished; {q0, q1, q2} and {q1, q2} are unfinished.

■ Process {q1}
δ(q1, 0) = {q2}, δ(q1, 1) = {q2}.
{q1} is now finished; {q2} is unfinished.

■ Process {q0, q1, q2}
δ({q0, q1, q2}, 0) = δ(q0, 0) ∪ δ(q1, 0) ∪ δ(q2, 0)
= {q0, q1} ∪ {q2} ∪ ∅ = {q0, q1, q2}
δ({q0, q1, q2}, 1) = δ(q0, 1) ∪ δ(q1, 1) ∪ δ(q2, 1)
= {q1} ∪ {q2} ∪ {q2} = {q1, q2}.

■ Process {q1, q2}
δ({q1, q2}, 0) = δ(q1, 0) ∪ δ(q2, 0) = {q2}
δ({q1, q2}, 1) = δ(q1, 1) ∪ δ(q2, 1) = {q2}.

■ Process {q2}
δ(q2, 0) = ∅, δ(q2, 1) = {q2}.
Similarly, δ(∅, 0) = δ(∅, 1) = ∅.

The completed transition table of the DFA is:

 δ′               0               1
→{q0}            {q0, q1}        {q1}
*{q1}            {q2}            {q2}
 {q2}             ∅              {q2}
*{q0, q1}        {q0, q1, q2}    {q1, q2}
*{q1, q2}        {q2}            {q2}
*{q0, q1, q2}    {q0, q1, q2}    {q1, q2}
 ∅                ∅               ∅

Thus, the equivalent DFA is D = (Q′, Σ, δ′, q0′, F′), where
Q′ = {{q0}, {q1}, {q2}, {q0, q1}, {q1, q2}, {q0, q1, q2}, ∅},
F′ = {{q1}, {q0, q1}, {q1, q2}, {q0, q1, q2}} and δ′ is shown in figure 5.10.


Figure 5.10. State Transition Diagram of DFA
EXAMPLE 5.2.6: Construct a DFA for the NFA shown in figure 5.11.


Figure 5.11. State Transition Diagram


Solution:


The transition table for the above NFA is shown in table 5.6.

 Q        0           1
→q0      {q1}         ∅
 q1      {q0, q2}    {q1, q2}
 q2      {q1}        {q2}

Table 5.6 State Transition Table

No. of states in NFA = 3.
Possible states in DFA = 2^3 = 8.

Consider q0′ = {q0}, which is marked unfinished.

■ Process {q0}
δ(q0, 0) = {q1}, δ(q0, 1) = ∅.
{q0} is now finished; {q1} and ∅ are unfinished.

■ Process {q1}
δ(q1, 0) = {q0, q2}, δ(q1, 1) = {q1, q2}.
Similarly, δ(∅, 0) = δ(∅, 1) = ∅.
{q1} and ∅ are now finished; {q0, q2} and {q1, q2} are unfinished.

■ Process {q0, q2}
δ({q0, q2}, 0) = δ(q0, 0) ∪ δ(q2, 0) = {q1}
δ({q0, q2}, 1) = δ(q0, 1) ∪ δ(q2, 1) = {q2}.
{q0, q2} is now finished; {q2} is unfinished.

■ Process {q1, q2}
δ({q1, q2}, 0) = δ(q1, 0) ∪ δ(q2, 0)
= {q0, q2} ∪ {q1} = {q0, q1, q2}
δ({q1, q2}, 1) = δ(q1, 1) ∪ δ(q2, 1)
= {q1, q2} ∪ {q2} = {q1, q2}.
{q1, q2} is now finished; {q0, q1, q2} is unfinished.

■ Process {q2}
δ(q2, 0) = {q1}, δ(q2, 1) = {q2}.

■ Process {q0, q1, q2}
δ({q0, q1, q2}, 0) = δ(q0, 0) ∪ δ(q1, 0) ∪ δ(q2, 0)
= {q1} ∪ {q0, q2} ∪ {q1}
= {q0, q1, q2}
δ({q0, q1, q2}, 1) = δ(q0, 1) ∪ δ(q1, 1) ∪ δ(q2, 1)
= ∅ ∪ {q1, q2} ∪ {q2}
= {q1, q2}.

The completed transition table of the DFA is:

 δ′               0               1
→{q0}            {q1}             ∅
*{q1}            {q0, q2}        {q1, q2}
 ∅                ∅               ∅
 {q0, q2}        {q1}            {q2}
*{q1, q2}        {q0, q1, q2}    {q1, q2}
 {q2}            {q1}            {q2}
*{q0, q1, q2}    {q0, q1, q2}    {q1, q2}

Thus, the equivalent DFA is


D = (Q′, Σ, δ′, q0′, F′),


where Q′ = {{q0}, {q1}, {q2}, {q0, q2}, {q1, q2}, {q0, q1, q2}, ∅}, F′ = {{q1}, {q1, q2}, {q0, q1, q2}}
and δ′ is shown in figure 5.12.


Figure 5.12. State Transition Diagram of DFA
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5.2.3 Construction of a DFA equivalent to NFA-€ 


To obtain the DFA from the given NFA-ε, the subset construction algorithm is used. The
following points have to be noted for ε-transitions.


• The start state q0' of the DFA consists of the start state q0 of the NFA-ε and all states reachable from q0 by ε-transitions.
• While processing any state q', add all states that are reachable from q' by ε-transitions.
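
The two points above can be sketched in code. The following Python sketch is illustrative only: the dictionary encoding of δ (with the empty string "" standing for ε), the function names `eps_closure` and `nfa_eps_to_dfa`, and the small three-state NFA-ε used as input are assumptions made for this example and do not come from the text.

```python
from collections import deque

def eps_closure(states, delta):
    """Return every state reachable from `states` using only ε-moves ("" key)."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

def nfa_eps_to_dfa(delta, start, finals, alphabet):
    """Subset construction; each DFA state is a frozenset of NFA states."""
    start_set = eps_closure({start}, delta)   # start state plus its ε-closure
    dfa_delta = {}
    seen = {start_set}
    unfinished = deque([start_set])           # states still to be processed
    while unfinished:
        current = unfinished.popleft()
        for a in alphabet:
            moved = set()
            for q in current:
                moved |= delta.get((q, a), set())
            target = eps_closure(moved, delta)  # add states reachable with ε
            dfa_delta[(current, a)] = target
            if target not in seen:
                seen.add(target)
                unfinished.append(target)
    dfa_finals = {s for s in seen if s & set(finals)}
    return seen, dfa_delta, start_set, dfa_finals

# Assumed example: q0 loops on a, q1 on b, q2 on c, with q0 -ε-> q1 -ε-> q2.
delta = {
    ("q0", "a"): {"q0"}, ("q0", ""): {"q1"},
    ("q1", "b"): {"q1"}, ("q1", ""): {"q2"},
    ("q2", "c"): {"q2"},
}
states, dfa_delta, start_set, dfa_finals = nfa_eps_to_dfa(delta, "q0", {"q2"}, "abc")
```

The queue `unfinished` plays the role of the "unfinished" markers used in the worked examples: a subset is processed once and every new subset it produces is queued.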


EXAMPLE 5.2.7: Construct a DFA, equivalent to the NFA given in figure 5.13. 




Figure 5.13. State Transition Diagram 


Solution: 
Transition table for the NFA-ε is given below:

δ          0              1           ε
q0         ∅            {q1}        {q2}
q1       {q0, q2}       {q2}         ∅
q2         ∅              ∅          ∅

Table 5.7. State Transition Table

No. of states in the NFA-ε = 3
Possible DFA states = 2^3 = 8

Consider q0', consisting of the start state q0 and all states reachable from q0 by ε-transitions:

q0' = {q0, q2}, which is unfinished.

■ Process {q0, q2}: add all transitions from {q0, q2}, reachable with ε.

δ({q0, q2}, 0) = δ(q0, 0) ∪ δ(q2, 0) = ∅
δ({q0, q2}, 1) = δ(q0, 1) ∪ δ(q2, 1) = {q1}
Unfinished: {q1}

■ Process {q1}: add all transitions from q1, reachable with ε.

δ(q1, 0) = {q0, q2}, δ(q1, 1) = {q2}
Unfinished: {q2}

■ Process {q2}: add all transitions from q2, reachable with ε.

δ(q2, 0) = δ(q2, 1) = ∅

δ               0              1
{q0, q2}        ∅            {q1}
∅               ∅              ∅
{q1}          {q0, q2}       {q2}
{q2}            ∅              ∅

Thus, the equivalent DFA is D = (Q', Σ, δ', q0', F') where
Q' = ({q0, q2}, ∅, {q1}, {q2}), F' = ({q0, q2}) and δ' is shown in figure 5.14.




Figure 5.14. State Transition Diagram for DFA. 


EXAMPLE 5.2.8: Construct a DFA, equivalent to NFA given in figure 5.15. 




Figure 5.15. State Transition Diagram. 


Solution: 
Transition table for the NFA-ε is given below.

Present               Inputs
state        a         b         c         ε
q0         {q0}        ∅         ∅       {q1}
q1           ∅       {q1}        ∅       {q2}
q2           ∅         ∅       {q2}        ∅

Table 5.8 State Transition Table

No. of states in the NFA-ε = 3
Possible DFA states = 2^3 = 8

Consider q0', consisting of the start state q0 and all states reachable from q0 by ε-transitions:

q0' = {q0, q1, q2}, which is unfinished.

■ Process {q0, q1, q2}

δ({q0, q1, q2}, a) = δ(q0, a) ∪ δ(q1, a) ∪ δ(q2, a)
                   = {q0, q1, q2}   (q0 together with the states reachable from it with ε)
δ({q0, q1, q2}, b) = δ(q0, b) ∪ δ(q1, b) ∪ δ(q2, b)
                   = {q1, q2}
δ({q0, q1, q2}, c) = δ(q0, c) ∪ δ(q1, c) ∪ δ(q2, c)
                   = {q2}
Unfinished: {q1, q2}, {q2}

■ Process {q1, q2}

δ({q1, q2}, a) = δ(q1, a) ∪ δ(q2, a) = ∅
δ({q1, q2}, b) = δ(q1, b) ∪ δ(q2, b) = {q1, q2}
δ({q1, q2}, c) = δ(q1, c) ∪ δ(q2, c) = {q2}

■ Process {q2}

δ(q2, a) = δ(q2, b) = ∅, δ(q2, c) = {q2}

δ                   a                b            c
{q0, q1, q2}    {q0, q1, q2}      {q1, q2}      {q2}
{q1, q2}            ∅             {q1, q2}      {q2}
{q2}                ∅                ∅          {q2}
∅                   ∅                ∅           ∅

Thus, the equivalent DFA is D = (Q', Σ, δ', q0', F'), where
Q' = ({q0, q1, q2}, {q1, q2}, {q2}, ∅), F' = ({q0, q1, q2}, {q1, q2}, {q2}) and δ' is shown
in figure 5.16.



Figure 5.16. State Transition Diagram of DFA. 
EXAMPLE 5.2.9: Construct a DFA equivalent to the NFA given in figure 5.17.



Figure 5.17. State Transition Diagram. 


Solution: 
Transition table for the above NFA is given below.

Present             Inputs
state        a            b         ε
q0         {q0}         {q2}      {q1}
q1         {q1}         {q2}        ∅
q2         {q0, q2}     {q1}        ∅

Table 5.9 State Transition Table

No. of states in the NFA = 3
Possible DFA states = 2^3 = 8

Consider q0', consisting of the start state q0 and all states reachable from q0 by ε:

q0' = {q0, q1}, which is unfinished.

■ Process {q0, q1}

δ({q0, q1}, a) = δ(q0, a) ∪ δ(q1, a)
               = {q0, q1} ∪ {q1}   [δ(q0, a) + states reachable from q0 with ε]
               = {q0, q1}
δ({q0, q1}, b) = δ(q0, b) ∪ δ(q1, b)
               = {q2} ∪ {q2} = {q2}
Unfinished: {q2}

■ Process {q2}

δ(q2, a) = {q0, q1, q2}
δ(q2, b) = {q1}
Unfinished: {q0, q1, q2}, {q1}

■ Process {q0, q1, q2}

δ({q0, q1, q2}, a) = δ(q0, a) ∪ δ(q1, a) ∪ δ(q2, a)
                   = {q0, q1} ∪ {q1} ∪ {q0, q1, q2}
                   = {q0, q1, q2}
δ({q0, q1, q2}, b) = δ(q0, b) ∪ δ(q1, b) ∪ δ(q2, b)
                   = {q2} ∪ {q2} ∪ {q1}
                   = {q1, q2}
Unfinished: {q1}, {q1, q2}

■ Process {q1}

δ(q1, a) = {q1}, δ(q1, b) = {q2}

■ Process {q1, q2}

δ({q1, q2}, a) = δ(q1, a) ∪ δ(q2, a)
               = {q1} ∪ {q0, q1, q2}
               = {q0, q1, q2}
δ({q1, q2}, b) = δ(q1, b) ∪ δ(q2, b)
               = {q2} ∪ {q1}
               = {q1, q2}

δ                   a                 b
{q0, q1}          {q0, q1}          {q2}
{q2}              {q0, q1, q2}      {q1}
{q0, q1, q2}      {q0, q1, q2}      {q1, q2}
{q1}              {q1}              {q2}
{q1, q2}          {q0, q1, q2}      {q1, q2}

Thus, the equivalent DFA is D = (Q', Σ, δ', q0', F') where

Q' = ({q0, q1}, {q2}, {q0, q1, q2}, {q1}, {q1, q2}), F' = ({q0, q1}, {q0, q1, q2}, {q1}, {q1, q2})
and δ' is shown in figure 5.18.


Figure 5.18. Equivalent DFA. 


EXAMPLE 5.2.10: Construct a DFA equivalent to the NFA given in figure 5.19.



Figure 5.19. State Transition Diagram. 


Solution: 
Transition table for the above NFA is given below.

Present          Inputs
state        a        b        ε
q0         {q1}       ∅        ∅
q1         {q1}       ∅      {q2}
q2           ∅      {q0}       ∅

Table 5.10 State Transition Table

Consider,
q0' = {q0}, which is unfinished.

■ Process {q0}: add all transitions from q0, reachable with ε.

δ(q0, a) = {q1, q2}, δ(q0, b) = ∅
Unfinished: {q1, q2}, ∅

■ Process {q1, q2}: add all transitions from q1, q2, reachable with ε.

δ({q1, q2}, a) = {q1, q2}, δ({q1, q2}, b) = {q0}

δ               a              b
{q0}          {q1, q2}         ∅
{q1, q2}      {q1, q2}       {q0}
∅               ∅              ∅

Thus, the equivalent DFA is D = (Q', Σ, δ', q0', F'), where Q' = ({q0}, {q1, q2}, ∅),
F' = ({q1, q2}) and δ' is shown in figure 5.20.



Figure 5.20. Equivalent DFA. 


5.3 Equivalence of NFA with ε-Moves to NFA without ε-Moves


5.3.1 Definition


Two finite automata Nε and N are said to be equivalent if
L(Nε) = L(N),


where Nε represents an NFA with ε-moves and N represents an NFA without ε-moves.
This means that for any language described by some Nε, there is an N that accepts the
same language.


Theorem II 

For every NFA with ε-moves, there exists an NFA without ε-moves that accepts
the same language. Alternatively, if L is accepted by an NFA with ε-moves, then L is
accepted by an NFA without ε-moves.

Proof:

Recall that an NFA with ε-moves (Nε) is defined by the 5-tuple,


Nε = (Q, Σ, δ, q0, F),


where Q: finite set of states
Σ: finite set of input symbols
δ: Q × (Σ ∪ {ε}) to 2^Q
q0: initial state
F: set of final states, F ⊆ Q.


An NFA without ε-moves (N) is defined by the 5-tuple,
N = (Q, Σ', δ', q0, F'),


where Q: finite set of states
Σ': finite set of input symbols
δ': Q × Σ' to 2^Q
q0: initial state
F': set of final states, F' ⊆ Q.


From the above definitions, an NFA without ε-moves can be constructed from an NFA with
ε-moves (Nε), where
Σ' = Σ − {ε}


with


F' = F ∪ {q0}, if ε-closure(q0) contains a state of F,
F' = F, otherwise,


and δ'(q, a) = δ̂(q, a) for q ∈ Q and a ∈ Σ, where δ̂ is the extension of δ that takes ε-moves into account. To show that both Nε and N accept the same language,
we prove by induction on |x| that δ'(q0, x) = δ̂(q0, x).
Base case: For |x| = 0, i.e., x = ε, we have δ'(q0, ε) = {q0} and
δ̂(q0, ε) = ε-closure(q0).


Hence δ'(q0, ε) = δ̂(q0, ε) only when ε-closure(q0) = {q0}. This is why the empty string is handled separately through the definition of F', and the induction starts at |x| = 1.
The result is true for |x| = 1 with x = 'a', by the definition of δ':


⟹ δ'(q0, a) = δ̂(q0, a)


Induction: Let us now assume that the result is true for each string of length n. Now we
shall show that this result is true for a string of length (n + 1), i.e., |x| > 1.


Let x = wa, where |w| = n. We need to prove that δ'(q0, wa) = δ̂(q0, wa).


Since

δ'(q0, wa) = δ'(δ'(q0, w), a),                         (1)

let δ̂(q0, w) = A; by the induction hypothesis, δ'(q0, w) = δ̂(q0, w).     (2)

δ'(q0, wa) = δ'(δ̂(q0, w), a)
           = δ'(A, a).         (from (1))

We are required to prove:

δ'(q0, wa) = δ̂(q0, wa),
i.e., δ'(A, a) = δ̂(q0, wa).

Since δ'(A, a) = ∪ δ'(q, a)   (union over q in A)
               = ∪ δ̂(q, a)   (union over q in A)        (3)

and ∪ δ̂(q, a)   (union over q in A)
               = δ̂(A, a)
               = δ̂(δ̂(q0, w), a)   [using δ̂(q0, w) = A]
               = δ̂(q0, wa).                              (4)

From (3) and (4), δ'(A, a) = δ̂(q0, wa).


Hence, δ'(q0, wa) = δ̂(q0, wa).
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5.3.2. Procedure to Construct an Equivalent of NFA with €-moves to 
an NFA without €-moves 


To obtain an NFA N from an NFA-ε Nε, the ε-transitions of the given automaton must be
eliminated. Simply deleting the ε-transitions from Nε, however, would change the language accepted
by the automaton. Thus, whenever ε-transitions are eliminated, some non-ε-transitions must
be added as substitutes so that the accepted language stays the same.
Hence, the procedure to transform Nε into N requires finding the non-ε-transitions to be added
to the automaton for every ε-transition that is eliminated.


For Nε = (Q, Σ, δ, q0, F) and N = (Q, Σ', δ', q0, F'), find


a. δ'(q, a) = ε-closure(δ(ε-closure(q), a)) and,


b. F' = F ∪ {q0} if ε-closure(q0) contains a member of F,
   F' = F otherwise.


Note-2:
When transforming Nε to N, only the transitions need to be changed; the
states remain the same.
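
Rules a and b can be tried out with a short program. The sketch below is a minimal illustration, assuming δ is stored as a Python dictionary keyed by (state, symbol) with "" standing for ε; the names `eps_closure` and `remove_eps` and the three-state automaton over {0, 1, 2} are choices made for this example.

```python
def eps_closure(states, delta):
    """Return every state reachable from `states` using only ε-moves ("" key)."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

def remove_eps(states, alphabet, delta, start, finals):
    """Build δ' and F' of an NFA without ε-moves from an NFA-ε."""
    new_delta = {}
    for q in states:
        closure_q = eps_closure({q}, delta)
        for a in alphabet:
            moved = set()
            for p in closure_q:
                moved |= delta.get((p, a), set())
            # Rule a: δ'(q, a) = ε-closure(δ(ε-closure(q), a))
            new_delta[(q, a)] = set(eps_closure(moved, delta))
    # Rule b: q0 becomes final if its ε-closure contains a final state.
    new_finals = set(finals)
    if eps_closure({start}, delta) & new_finals:
        new_finals.add(start)
    return new_delta, new_finals

# Assumed example: q0 loops on 0, q1 on 1, q2 on 2, with q0 -ε-> q1 -ε-> q2.
delta = {
    ("q0", "0"): {"q0"}, ("q0", ""): {"q1"},
    ("q1", "1"): {"q1"}, ("q1", ""): {"q2"},
    ("q2", "2"): {"q2"},
}
new_delta, new_finals = remove_eps(["q0", "q1", "q2"], "012", delta, "q0", {"q2"})
```

Note that the set of states passed in is returned unchanged; only the transitions and the final states are recomputed, in line with Note-2.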


EXAMPLE 5.3.1: Construct an equivalent NFA without €-moves for a given automaton: 




Figure 5.21. State Transition Diagram. 


Solution: 
Given Nε = (Q, Σ, δ, q0, F), where δ is given by


δ        0         1         2         ε
q0     {q0}        ∅         ∅       {q1}
q1       ∅       {q1}        ∅       {q2}
q2       ∅         ∅       {q2}        ∅


Table 5.11 State Transition Table

Let N = (Q, Σ', δ', q0, F').
Compute F':


ε-closure(q0) = {q0, q1, q2}
⟹ F' = F ∪ {q0}, since ε-closure(q0) contains a member of F


F' = {q0, q2}


Compute 8’: 

■ δ'(q0, 0) = ε-closure(δ(ε-closure(q0), 0))
            = ε-closure(δ({q0, q1, q2}, 0))
            = ε-closure(δ(q0, 0) ∪ δ(q1, 0) ∪ δ(q2, 0))
            = ε-closure({q0} ∪ ∅ ∪ ∅)
            = ε-closure({q0})
            = {q0, q1, q2}

■ δ'(q0, 1) = ε-closure(δ(ε-closure(q0), 1))
            = ε-closure(δ({q0, q1, q2}, 1))
            = ε-closure(δ(q0, 1) ∪ δ(q1, 1) ∪ δ(q2, 1))
            = ε-closure(∅ ∪ {q1} ∪ ∅)
            = ε-closure({q1})
            = {q1, q2}

■ δ'(q0, 2) = ε-closure(δ(ε-closure(q0), 2))
            = ε-closure(δ({q0, q1, q2}, 2))
            = ε-closure({q2})
            = {q2}

■ δ'(q1, 0) = ε-closure(δ(ε-closure(q1), 0))
            = ε-closure(δ({q1, q2}, 0))
            = ε-closure(δ(q1, 0) ∪ δ(q2, 0))
            = ε-closure(∅)
            = ∅

■ δ'(q1, 1) = ε-closure(δ(ε-closure(q1), 1))
            = ε-closure(δ({q1, q2}, 1))
            = {q1, q2}

■ δ'(q1, 2) = ε-closure(δ(ε-closure(q1), 2))
            = ε-closure(δ(q1, 2) ∪ δ(q2, 2))
            = ε-closure({q2})
            = {q2}

■ Also δ'(q2, 0) = ∅ and δ'(q2, 1) = ∅

■ δ'(q2, 2) = ε-closure(δ(ε-closure(q2), 2))
            = ε-closure(δ(q2, 2))
            = ε-closure({q2})
            = {q2}


Thus, the transition function δ' is:


Present                 Inputs
state         0               1            2
q0       {q0, q1, q2}      {q1, q2}      {q2}
q1            ∅            {q1, q2}      {q2}
q2            ∅               ∅          {q2}


Table 5.12 State Transition Table 


Transition diagram: 




Figure 5.22. NFA without €-moves. 


EXAMPLE 5.3.2: Construct an equivalent NFA, without €-moves, for given automaton. 


a 


Figure 5.23. State Transition Diagram. 
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Solution: 
Given Nε = (Q, Σ, δ, q0, F), where δ is given by


Present          Inputs
state        0        1        ε
q0         {q1}       ∅        ∅
q1         {q1}       ∅      {q2}
q2           ∅      {q0}       ∅


Table 5.13 State Transition Table


Let N = (Q, Σ', δ', q0, F').
Compute F':


ε-closure(q0) = {q0}
F' = F ∪ {q0} does not apply, because ε-closure(q0) does not contain a member of F.


> F={q) rT 
Compute 6’: 
■ δ'(q0, 0) = ε-closure(δ(ε-closure(q0), 0))
            = ε-closure(δ({q0}, 0))
            = ε-closure({q1})
            = {q1, q2}

■ δ'(q0, 1) = ε-closure(δ(ε-closure(q0), 1))
            = ε-closure(δ(q0, 1))
            = ε-closure(∅)
            = ∅

■ δ'(q1, 0) = ε-closure(δ(ε-closure(q1), 0))
            = ε-closure(δ({q1, q2}, 0))
            = ε-closure({q1})
            = {q1, q2}

■ δ'(q1, 1) = {q0}

■ δ'(q2, 0) = ∅

■ δ'(q2, 1) = {q0}

Transition function δ':


Present          Inputs
state         0            1
q0         {q1, q2}        ∅
q1         {q1, q2}      {q0}
q2            ∅          {q0}


Table 5.14 State Transition Table


Transition Diagram: 




Figure 5.24. NFA without €-moves. 


EXAMPLE 5.3.3: Convert the following NFA-€ to NFA. 




Figure 5.25. State Transition Diagram. 


Solution: 


Let N = (Q, Σ', δ', q0, F').
F' = {0, 1}


Compute ε-closure:


ε-closure(0) = {0, 1},
ε-closure(3) = {1, 3}


To compute δ':
δ'(0, a) = {1, 2}     δ'(0, b) = ∅
δ'(1, a) = {1, 2}     δ'(1, b) = ∅
δ'(2, a) = ∅          δ'(2, b) = {1, 3}
δ'(3, a) = {1, 2}     δ'(3, b) = ∅


Transition Function: 


Table 5.15 State Transition Table 


Transition Diagram 8’: 




Figure 5.26. NFA without €-moves. 


EXAMPLE 5.3.4: Convert the NFA-€, given in figure 5.27, to NFA. 


Figure 5.27. State Transition Diagram. 
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Solution: 


Let N = (Q, Σ', δ', q0, F').
F' = {0, 1}
Compute ε-closure:
ε-closure(0) = {0, 1}; ε-closure(1) = {1}


ε-closure(2) = {2, 3}; ε-closure(3) = {3}
ε-closure(4) = {1, 4}


To compute δ':


δ'(0, a) = {1, 2, 3}     δ'(0, b) = ∅
δ'(1, a) = {1, 2, 3}     δ'(1, b) = ∅
δ'(2, a) = {1, 4}        δ'(2, b) = {1, 4}
δ'(3, a) = {1, 4}        δ'(3, b) = ∅
δ'(4, a) = {1, 2, 3}     δ'(4, b) = ∅


Transition table:


Present           Inputs
state         a              b
0         {1, 2, 3}          ∅
1         {1, 2, 3}          ∅
2         {1, 4}           {1, 4}
3         {1, 4}             ∅
4         {1, 2, 3}          ∅


Table 5.16 State Transition Table 




Note-3:
Converting an NFA-ε to an NFA does not increase the number of states,
and converting an NFA with n states to a DFA increases the number of states to at
most 2^n.


Transition Diagram: 


Figure 5.28. NFA without €-moves for NFA in figure 5.27. 


5.4 Exercises


1. Define the equivalence of two finite automata.

2. Prove that L is accepted by some NFA if and only if it is accepted by some DFA.

3. Give the general procedure for conversion of NFA to DFA.

4. Apply the subset construction method to the NFA given in figure 5.29, and convert
it to a DFA. Trace the resulting DFA on the inputs ε and abac.



Figure 5.29. State Transition Diagram of NFA. 


5. Convert the following NFAs to respective DFAs. 


a.

Figure 5.30. State Diagram for NFA.

b.

Figure 5.31. State Diagram for NFA.

c.

Figure 5.32. State Diagram for NFA.


6. Prove that for every NFA, with €-moves, there exists an NFA without €-moves, that 
accepts the same language. 
7. Convert the following NFA-€ to NFA. 




Figure 5.33. State Diagram of NFA-é. 
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6 Minimisation /Optimisation of DFA 


‘Enjoy the power of problem-solving techniques; get optimal results.’ 


Introduction 


In this chapter, we briefly sample a few additional topics in DFA, which are of interest. 


6.1 Optimum DFA 


More than one DFA may accept the same language. Among these
equivalent DFAs, it is often useful to find the smallest, i.e., the DFA with the minimum
possible number of states. This is especially important when DFAs are used for designing
computer hardware circuits.


6.1.1 Definition 


Minimisation/optimisation of a deterministic finite automaton refers to the detection of those
states of a DFA whose presence or absence does not affect the language accepted
by the automaton.

The states that can be eliminated from an automaton, without affecting the language it accepts,
are:


■ Unreachable or inaccessible states.
■ Dead states.
■ Non-distinguishable (indistinguishable or equivalent) states.


6.1.2 Unreachable States 


These are the states that cannot be reached from the initial state of the DFA
by any possible input sequence.
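
Unreachable states can be found by a simple graph search from the initial state. The sketch below is only an illustration: the dictionary encoding of δ, the function name `reachable_states` and the three-state DFA are assumptions made for this example.

```python
from collections import deque

def reachable_states(delta, start, alphabet):
    """Breadth-first search over the transition graph, starting at `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        q = queue.popleft()
        for a in alphabet:
            r = delta.get((q, a))
            if r is not None and r not in seen:
                seen.add(r)
                queue.append(r)
    return seen

# Assumed DFA: state 5 has outgoing arcs but no arc leads into it.
delta = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 0,
    (5, "a"): 0, (5, "b"): 1,
}
all_states = {0, 1, 5}
unreachable = all_states - reachable_states(delta, 0, "ab")
```

Every state that the search never visits is unreachable and can be dropped together with its outgoing transitions.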
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EXAMPLE 6.1.1: (Unreachable states) 
a 
Figure 6.1. State Transition Diagram. 


i. Here, state 5 is unreachable from the initial state 0 with any input string (over a and b).


Figure 6.2. State Transition Diagram.


ii. Here, states q2 and q4 are unreachable from the initial state q0 with any input string
(over 0 and 1).


6.1.3. Dead State or Trap State 


A state is dead if it is not an accepting state and has no outgoing transitions except to
itself. Alternatively, a dead state is a non-final state of a DFA whose transitions on every
input symbol terminate on itself.


Formally, q is a dead state if q ∈ Q − F and δ(q, a) = q for every a ∈ Σ.
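
The formal condition translates directly into a test over all states. The following sketch assumes a complete DFA stored as a dictionary; the function name `dead_states` and the small DFA are hypothetical and serve only as an illustration.

```python
def dead_states(states, alphabet, delta, finals):
    """Non-final states whose transition on every symbol returns to themselves."""
    return {
        q for q in states
        if q not in finals and all(delta[(q, a)] == q for a in alphabet)
    }

# Assumed DFA: q1 is final and loops on itself, q3 is a non-final trap.
delta = {
    ("q0", "a"): "q1", ("q0", "b"): "q3",
    ("q1", "a"): "q1", ("q1", "b"): "q1",
    ("q3", "a"): "q3", ("q3", "b"): "q3",
}
dead = dead_states({"q0", "q1", "q3"}, "ab", delta, {"q1"})
```

State q1 also loops on every input, but it is not reported because it is an accepting state.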
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EXAMPLE 6.1.2: (Dead state) 




Figure 6.3. State Transition Diagram. 


Here, if the given input is 'b' at state q0, then the machine enters state q3 and remains
there for any input string (a or b). Hence, q3 is a dead or a trap state.


6.1.4 Indistinguishable and Distinguishable States 


States are said to be indistinguishable if their merger does not change the language accepted
by a DFA; otherwise the states are distinguishable.


Formally, they are defined as,


■ Indistinguishable: Two states p and q of a DFA are called indistinguishable
if δ(p, w) ∈ F ⟹ δ(q, w) ∈ F
and δ(p, w) ∉ F ⟹ δ(q, w) ∉ F, ∀ w ∈ Σ*.
■ Distinguishable: Two states p and q of a DFA are called distinguishable
if there is some w ∈ Σ* such that δ(p, w) ∈ F and δ(q, w) ∉ F,
or δ(p, w) ∉ F and δ(q, w) ∈ F.


In other words, two states p and q of a DFA are equivalent or indistinguishable if every
string w leads from p to a final state if and only if it also leads from q to a final state.


To show that two states are non-equivalent or distinguishable, we need to find only
one string that leads from one of them to a final state and leads from the other to a non-final
state.
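
Searching for such a string can be mechanised by exploring pairs of states. The sketch below is an illustration under stated assumptions: the DFA is complete and stored as a dictionary, and the function name `distinguishing_string` and the five-state DFA are hypothetical.

```python
from collections import deque

def distinguishing_string(delta, finals, alphabet, p, q):
    """Return a shortest w that leads exactly one of p, q to a final state, or None."""
    seen = {(p, q)}
    queue = deque([(p, q, "")])
    while queue:
        s, t, w = queue.popleft()
        if (s in finals) != (t in finals):
            return w                      # w distinguishes p and q
        for a in alphabet:
            nxt = (delta[(s, a)], delta[(t, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt[0], nxt[1], w + a))
    return None                           # no such string: p and q are equivalent

# Assumed DFA over {0, 1} with the single final state q4.
delta = {
    ("q0", "0"): "q1", ("q0", "1"): "q3",
    ("q1", "0"): "q2", ("q1", "1"): "q4",
    ("q2", "0"): "q1", ("q2", "1"): "q4",
    ("q3", "0"): "q2", ("q3", "1"): "q4",
    ("q4", "0"): "q4", ("q4", "1"): "q4",
}
finals = {"q4"}
```

Because only finitely many pairs of states exist, the search always terminates; when it returns `None`, every string treats the two states alike.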
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EXAMPLE 6.1.3: (Indistinguishable and distinguishable state) 




Figure 6.4. State Transition Diagram. 


i. 


Consider two states of the DFA, q0 and q1, with Q = {q0, q1, q3, q5} and F = {q3, q5}.


■ Input = 1, δ(q0, 1) = q3 ∈ F
             δ(q1, 1) = q3 ∈ F
■ Input = 0, δ(q0, 0) = q1 ∉ F
             δ(q1, 0) = q0 ∉ F.

Since the input 1 leads from q0 to the final state q3 and also leads from q1
to the final state q3, while the input 0 leads from both states to non-final states, the two states (q0, q1) are called indistinguishable or equivalent
states.



Figure 6.5. State Transition Diagram. 


a. Consider two states of the DFA (figure 6.5), q1 and q2, with Q = {q0, q1, q2, q3,
q4, q5}, F = {q3} and Σ = {a, b}.


■ Input = b, δ(q1, b) = q5 ∉ F
             δ(q2, b) = q5 ∉ F.


Since the string 'b' leads from q1 to the non-final state q5 and also leads from
q2 to the non-final state q5, the two states (q1, q2) are called equivalent states.
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b. Consider the two states, gq. and qs of the above DFA, with Q = 
{90,915 92, 93,9495}, F = {q3} and & = {a,b}. 
@ Input =), 5(q2,b)=95 ¢ F 
(95,6) = q3 € F. 
@ Input =a, 5(q¢2z,a)=q¢F 
5(45,4) = qo ¢ F. 


Here, the string 'b' leads from q5 to the final state q3, while it leads from q2
to the non-final state q5. Hence, q2 and q5 are non-equivalent or distinguishable.


6.1.5 Procedure to Detect Indistinguishable State 


Step-1: Partition π = (F, Q − F), where F is the set of final states and Q is
the set of states.

Step-2: Form π_new by partitioning the subsets of π such that,

a. For any input symbols a and b, if a transition is made to states of the
same subset of π or final states, then no partitions are made further.
Go to step-5.

b. For any input symbol a and b, if a transition is made to a state of a
different subset of π, then the subset is further partitioned.

Step-3: π = π_new

Step-4: Repeat step-2 through step-3 as long as step-2 changes the partition, i.e., while π_new ≠ π.

Step-5: If the members of a subset of π, on input symbol 'a' or 'b', go to the
same transition states (either final or non-final), then the states are
indistinguishable.
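
The steps above can be sketched as a small program. This is a minimal illustration, assuming a complete DFA stored as a dictionary; the function names `refine` and `indistinguishable_blocks` and the five-state DFA are hypothetical, and states are grouped by the subsets their transitions lead into.

```python
def refine(partition, delta, alphabet):
    """One pass of step-2: split every subset by the subsets its members move into."""
    block_of = {q: i for i, block in enumerate(partition) for q in block}
    new_partition = []
    for block in partition:
        groups = {}
        for q in block:
            signature = tuple(block_of[delta[(q, a)]] for a in alphabet)
            groups.setdefault(signature, set()).add(q)
        new_partition.extend(frozenset(g) for g in groups.values())
    return new_partition

def indistinguishable_blocks(states, finals, delta, alphabet):
    finals = frozenset(finals)
    # Step-1: start from (F, Q - F), dropping an empty subset if there is one.
    partition = [b for b in (finals, frozenset(states) - finals) if b]
    while True:
        new_partition = refine(partition, delta, alphabet)
        if len(new_partition) == len(partition):   # nothing was split: stop
            return set(new_partition)
        partition = new_partition

# Assumed DFA over {0, 1} with the single final state q4.
delta = {
    ("q0", "0"): "q1", ("q0", "1"): "q3",
    ("q1", "0"): "q2", ("q1", "1"): "q4",
    ("q2", "0"): "q1", ("q2", "1"): "q4",
    ("q3", "0"): "q2", ("q3", "1"): "q4",
    ("q4", "0"): "q4", ("q4", "1"): "q4",
}
blocks = indistinguishable_blocks({"q0", "q1", "q2", "q3", "q4"}, {"q4"}, delta, "01")
```

Since a pass can only split subsets, comparing the number of subsets before and after is enough to detect that the partition has stopped changing.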


EXAMPLE 6.1.4: Identify the indistinguishable states from the DFA, shown in figure 6.6. 


a 


Figure 6.6. State Transition Diagram.

a. π = ({2, 3, 4}, {1, 5})


Partitions: {2, 3, 4} | {1, 5}


Form π_new with subset {2, 3, 4}:


■ Input = a, δ(2, a) = 4
             δ(3, a) = 5
             δ(4, a) = 4.


Since for the input a, state 3 gives a transition to a state of the other subset
{1, 5}, 3 is partitioned.


π_new = ({2, 4}, {3}, {1, 5})
π = π_new.


b. π = ({2, 4}, {3}, {1, 5})


Partitions: {2, 4} | {3} | {1, 5}


Form π_new with subset {2, 4}:
■ Input = a, δ(2, a) = 4
             δ(4, a) = 4
■ Input = b, δ(2, b) = 1
             δ(4, b) = 4
Since for the input b, state 2 gives a transition to a state of the other subset {1, 5}, 2 is
partitioned.
π_new = ({2}, {4}, {3}, {1, 5})


π = π_new


c. π = ({2}, {4}, {3}, {1, 5})


Partitions: {2} | {4} | {3} | {1, 5}


Form π_new with subset {1, 5}:
■ Input = a, δ(1, a) = 3
             δ(5, a) = 3
■ Input = b, δ(1, b) = 2
             δ(5, b) = 2.
Since for the inputs a and b, states 1 and 5 give transitions to the same state, they are
not partitioned.
Hence


π_new = ({2}, {4}, {3}, {1, 5})
⟹ π_new = π is true, i.e., π_new ≠ π is false.
∴ stop the iteration.


Thus, states 1 and 5, on the inputs a and b, give transitions to the same non-final states.
Hence, 1 and 5 are called indistinguishable states.


6.2 Minimal DFA 


A DFA is minimal, if and only if, 


a. all its states are reachable from the start state, 
b. all its states are distinguishable. 


6.2.1 Minimisation Algorithm for DFA 


Let M = <Q, Σ, q0, δ, F> be a DFA that accepts a language L. Then, the following
algorithm produces the DFA that has the smallest number of states among all the DFAs
that accept L.


Step-1: Identify all unreachable or inaccessible states and eliminate them from the
DFA, M.


Step-2: Identify all indistinguishable states from the DFA and merge them to form
the DFA with the smallest number of states.


Step-3: Construct a DFA from π_final.
Step-4: END.


The following is the procedure to detect an indistinguishable state: 


■ Construct a partition π = {F, Q − F}
■ do
      π_old = π
      π_new = Make-partition(π)
      π = π_new
  while (π_new ≠ π_old).

  π_final = π

■ If the members of a subset of π, on an input symbol, go to the same transition states, either final
or non-final, then the states are indistinguishable.
Make-partition is a function as follows:


Function Make-partition(π)
For each set S of π do
Begin


Partition S into subsets such that two states p and q of S are in the same
subset if and only if, for each input symbol a or b, p and q make a transition to
states of the same subset of π. Otherwise, no partitions are made.

End.
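
The whole algorithm, from removing unreachable states to building the DFA from π_final, can be sketched as follows. This is a minimal illustration and not the only way to organise the computation: the function name `minimise`, the dictionary encoding of δ and the six-state DFA (with an unreachable state q5) are assumptions made for this example.

```python
def minimise(alphabet, delta, start, finals):
    # Step-1: keep only the states reachable from the start state.
    reachable, frontier = {start}, [start]
    while frontier:
        q = frontier.pop()
        for a in alphabet:
            r = delta[(q, a)]
            if r not in reachable:
                reachable.add(r)
                frontier.append(r)
    # Step-2: refine {F, Q - F} until no subset is split any further.
    finals = frozenset(finals) & reachable
    partition = [b for b in (finals, frozenset(reachable) - finals) if b]
    while True:
        block_of = {q: i for i, b in enumerate(partition) for q in b}
        groups = {}
        for q in reachable:
            key = (block_of[q],) + tuple(block_of[delta[(q, a)]] for a in alphabet)
            groups.setdefault(key, set()).add(q)
        new_partition = [frozenset(g) for g in groups.values()]
        if len(new_partition) == len(partition):
            break
        partition = new_partition
    # Step-3: every subset of the final partition becomes one state.
    block_of = {q: b for b in partition for q in b}
    min_delta = {(block_of[q], a): block_of[delta[(q, a)]]
                 for q in reachable for a in alphabet}
    min_finals = {block_of[q] for q in finals}
    return set(partition), min_delta, block_of[start], min_finals

# Assumed DFA over {0, 1}; q5 cannot be reached from q0.
delta = {
    ("q0", "0"): "q1", ("q0", "1"): "q3",
    ("q1", "0"): "q2", ("q1", "1"): "q4",
    ("q2", "0"): "q1", ("q2", "1"): "q4",
    ("q3", "0"): "q2", ("q3", "1"): "q4",
    ("q4", "0"): "q4", ("q4", "1"): "q4",
    ("q5", "0"): "q0", ("q5", "1"): "q4",
}
min_states, min_delta, min_start, min_finals = minimise("01", delta, "q0", {"q4"})
```

Any member of a subset can be used to read off the transitions of the merged state, because all members of a subset move into the same subsets.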


EXAMPLE 6.2.1: Minimise the DFA. 


Figure 6.7. State Transition Diagram. 


Solution: 


Unreachable state: If we enumerate all simple paths starting from the initial state, then
we find that the state q5 is unreachable. So, it is removed.
Indistinguishable state: 


a r= ({40, 91, 92}, {q3,44}) 
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Form 2yew, with subset {go, 91, g2}. 
B Input =0, 5(g0,0)=q1 


5(q1,0) = q2 
5(q2,0)=¢ 
a Input =1, 5(90,1) = q 
6(q1,1) = 43 
5(q2, 1) = qa. 


Since for the input '1', states q1 and q2 give transitions to states of the other subset
{q3, q4}, {q1, q2} are partitioned.
⟹ π_new = ({q0}, {q1, q2}, {q3, q4})


π = π_new.


b. π = ({q0}, {q1, q2}, {q3, q4})


Form π_new with subset {q1, q2}:


m@ Input =0, 6(q1,0)=q 
5(q2,0) = 
@ Input =1, 6q,D=qa¢«EF 
§(q2,1I) =q4 EF 
On input '1', q1 and q2 give transitions to states of the other subset {q3, q4}. Also, both q3 and
q4 are final states; thus q1 and q2 are not partitioned. Hence, they are indistinguishable.
Forming π_new with subset {q3, q4}:
@ Input=0, 4(q3,0) = q3 
5(q4,0) = 
@ Input=1, 6q@m,D=q@aeEeFrF 
5(q4, 1) = 44 € F. 
Since for the input '1', q3 and q4 give transitions to members of the same subset, they are not partitioned
and are indistinguishable.
⟹ π ≠ π_new is false.


Merging of all indistinguishable states leads to a minimum DFA, given in figure 6.8. 
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Figure 6.8. Minimised DFA. 


EXAMPLE 6.2.2: Reduce the DFA given in figure 6.9. 




Solution: 


Unreachable states: There are no unreachable states in the above DFA. 
Indistinguishable states: 


a. π = ({q0, q1, q2, q3}, {q4})


Partition: {q0, q1, q2, q3} | {q4}


Forming π_new with subset {q0, q1, q2, q3}:


a Input =0, 5(¢0,0)=q a Input=1, d(¢o,1) = 43 
6(41,0) = q2 6(q1,1) = 44 
6(q2,0) = 41 6(q2, 1) = 44 
5(q3,0) = q2 5(q3, 1) = qa. 


Since for the input '1', q1, q2 and q3 give transitions to the member of the other subset
{q4}, {q1, q2, q3} are partitioned.
⟹ π_new = ({q0}, {q1, q2, q3}, {q4})


π = π_new.


b. π = ({q0}, {q1, q2, q3}, {q4})


Forming π_new with subset {q1, q2, q3}:


■ Input = 0, δ(q1, 0) = q2        ■ Input = 1, δ(q1, 1) = q4 ∈ F
             δ(q2, 0) = q1                     δ(q2, 1) = q4 ∈ F
             δ(q3, 0) = q2                     δ(q3, 1) = q4 ∈ F.


Since for the input 1, q1, q2 and q3 give transitions to the member of the other subset
{q4}, which is the final state, and for the input 0 they give transitions to members of their own subset, q1, q2 and q3 are not partitioned. Hence, they are
indistinguishable.


Therefore, π_new = ({q0}, {q1, q2, q3}, {q4})
⟹ π ≠ π_new is false.


Merging of all indistinguishable states results in a minimum DFA, given in figure 6.10. 




Figure 6.10. Minimised DFA. 
EXAMPLE 6.2.3: Reduce the DFA given in figure 6.11. 




Figure 6.11. State Transition Diagram. 
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Solution: 


Unreachable state: By enumerating all simple paths from the initial state, we find that
both q2 and q4 are not a part of such paths. Therefore, q2 and q4 are unreachable states and
are removed.


Indistinguishable states:
a. π = ({q0, q1}, {q3, q5})


Partition: {q0, q1} | {q3, q5}


Forming π_new with subset {q0, q1}:
■ Input = 0, δ(q0, 0) = q1
             δ(q1, 0) = q0
■ Input = 1, δ(q0, 1) = q3 ∈ F
             δ(q1, 1) = q3 ∈ F
On the input '0', q1 and q0 give transitions to members of the same subset, while with
input '1', they give transitions to q3 (which is a final state). Hence, they are not
partitioned and are indistinguishable.


Further, forming π_new with subset {q3, q5}:


■ Input = 0, δ(q3, 0) = q5 ∈ F
             δ(q5, 0) = q5 ∈ F
■ Input = 1, δ(q3, 1) = q5 ∈ F
             δ(q5, 1) = q5 ∈ F


Since both q3 and q5 give transitions to members of the same subset (both belong to the final set), they
are not partitioned and are indistinguishable.


Therefore, π_new = ({q0, q1}, {q3, q5})
⟹ π ≠ π_new is false.


Merging of all indistinguishable states leads to a minimum DFA, given in figure 6.12. 




Figure 6.12. Minimised DFA.
EXAMPLE 6.2.4: Reduce the DFA given below:


Figure 6.13. State Transition Diagram. 


Solution: 
Unreachable states: No unreachable states.
Indistinguishable states:


a. π = ({2, 3, 4}, {1, 5})


Partitions: {2, 3, 4} | {1, 5}


Forming π_new with subset {2, 3, 4}:
■ Input = a, δ(2, a) = 4
             δ(3, a) = 5
             δ(4, a) = 4
Since for the input a, 3 gives a transition to a state of the other subset {1, 5}, 3 is
partitioned.
Therefore, π_new = ({2, 4}, {3}, {1, 5})
π = π_new


b. π = ({2, 4}, {3}, {1, 5})


Partitions: {2, 4} | {3} | {1, 5}


Forming π_new with subset {2, 4}:
■ Input = a, δ(2, a) = 4
             δ(4, a) = 4




Downloaded from https://www.cambridge.org/core. Stockholm University Library, on 06 Dec 2018 at 08:00:45, subject to the Cambridge Core terms of use, available at 
https://www.cambridge.org/core/terms. https://doi.org/10.1017/UPO09788175968363.007 


A Textbook on Automata Theory 


■ Input = b, δ(2, b) = 1
             δ(4, b) = 4
Since for the input b, 2 gives a transition to a state of the other subset {1, 5}, 2 is
partitioned.


Therefore, π_new = ({2}, {4}, {3}, {1, 5})
π = π_new.


c. π = ({2}, {4}, {3}, {1, 5})


Partition: {2} | {4} | {3} | {1, 5}


Forming π_new with subset {1, 5}:


■ Input = a, δ(1, a) = 3
             δ(5, a) = 3
■ Input = b, δ(1, b) = 2
             δ(5, b) = 2


Since for the inputs a and b, the members of subset {1, 5} give transitions to the same state
and both 1 and 5 are final states, no partition is done.


π_new = ({2}, {4}, {3}, {1, 5})
⟹ π_new ≠ π is false.


Thus, {1, 5} are indistinguishable.


Merging of all indistinguishable states leads to a minimum DFA, given in figure 6.14. 




Figure 6.14. Minimised DFA. 
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EXAMPLE 6.2.5: Minimise the number of states of DFA in figure 6.15. 




Figure 6.15. State Transition Diagram. 


Solution: 
Unreachable states: No unreachable states.
Indistinguishable states:


a. π = ({3}, {1, 2, 4, 5, 6}).


Partition: {3} | {1, 2, 4, 5, 6}


Forming π_new with subset {1, 2, 4, 5, 6}:
■ Input = a, δ(1, a) = 2        ■ Input = b, δ(1, b) = 3
             δ(2, a) = 2                     δ(2, b) = 4
             δ(4, a) = 6                     δ(4, b) = 3
             δ(5, a) = 5                     δ(5, b) = 3
             δ(6, a) = 5                     δ(6, b) = 4


Since on the input b, states 1, 4 and 5 give transitions to the member of the other subset {3}, {1, 4, 5}
are partitioned.


Therefore, π_new = ({3}, {1, 4, 5}, {2, 6})
π = π_new.


b. π = ({3}, {1, 4, 5}, {2, 6}).

Partition: {3} | {1, 4, 5} | {2, 6}


Forming π_new with subset {1, 4, 5}:
■ Input = a, δ(1, a) = 2
             δ(4, a) = 6
             δ(5, a) = 5


Since 1 and 4 on the input 'a' give transitions to members of the other subset {2, 6}, while 5 stays within {1, 4, 5},
partition is done.


Therefore, π_new = ({3}, {1, 4}, {5}, {2, 6})
π = π_new.


c. π = ({3}, {1, 4}, {5}, {2, 6}).


Partition: {3} | {1, 4} | {5} | {2, 6}


Forming π_new with subset {2, 6}:


■ Input = a, δ(2, a) = 2
             δ(6, a) = 5
■ Input = b, δ(2, b) = 4
             δ(6, b) = 4
Since on the input a, state 2 stays within {2, 6} while state 6 gives a transition to the member of the other subset {5}, the partition is
done.
π_new = ({3}, {1, 4}, {5}, {2}, {6}).


π = π_new.


d. π = ({3}, {1, 4}, {5}, {2}, {6})


Partition: {3} | {1, 4} | {5} | {2} | {6}


Forming π_new with subset {1, 4}:


■ Input = a, δ(1, a) = 2
             δ(4, a) = 6


Since on the input a, 1 and 4 give transitions to members of different subsets, {2} and {6}, partition is performed.
Hence, π_new = ({3}, {1}, {4}, {5}, {2}, {6})


π = π_new.

Since every subset in π now contains a single state, nothing can be partitioned further and no indistinguishable states
are present.

Hence, the number of states of the given DFA is already minimum and it cannot be
reduced any further.


EXAMPLE 6.2.6: Reduce the DFA in figure 6.16. 




Figure 6.16. State Transition Diagram. 


Solution: 
Unreachable states: No unreachable states.
Indistinguishable states:
a. π = ({p}, {q, r}).
Forming π_new with {q, r}:
■ Input = 1, δ(q, 1) = r ∈ F
             δ(r, 1) = q ∈ F
■ Input = 0, δ(q, 0) = p
             δ(r, 0) = p


Since on the input 1, both q and r give transitions to members of the same subset (which are final), and
with input 0, both q and r give transitions to the same member of the other subset, p, which is
non-final, {q, r} are not partitioned and are indistinguishable.


Therefore, π_new = ({p}, {q, r})
⟹ π ≠ π_new is false.


Merging of {q, r} leads to a minimum state DFA, given in figure 6.17. 




Figure 6.17. Minimised DFA. 
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6.3 Two-Way DFA 


As discussed earlier, a one-way DFA (1DFA) is a mathematical model of a machine with 
a finite amount of memory, where the input is processed once from left to right. After the 
input has been read, the DFA decides whether the input is to be accepted or rejected. 

A two-way DFA (2DFA) consists of a finite state control and a read-only input tape that 
allows an input to be read back and forth. As in case of DFA, the 2DFA decides whether a 
given input is to be accepted or rejected. 


6.3.1 Definition 


A mathematical model of a machine, with the ability of a read-head to move left as well as 
right is a two-way DFA. 


6.3.2 Elements of 2DFA 
The 2DFA exhibits the following five characteristics:


1. a finite set of states Q,
2. an alphabet Σ of possible input symbols,
3. a transition function δ,
4. the initial state q0 ∈ Q,
5. the set of final states F, where F ⊆ Q.


Note-1: The term 2DFA refers to the fact that on each input, read-head is allowed to read 
back and forth. 


6.3.3 How 2DFA Operates 


The 2DFA operates in the following manner: initially, the 2DFA is assumed to be in the
initial state q0 with its read-head on the leftmost symbol of the input string. During each
move of the 2DFA, the read-head moves one position to the right or one position to the left,
depending on the transitions defined for the given 2DFA. The 2DFA is said to have accepted
an input string if it moves the read-head off the right end of the tape while simultaneously
entering an accepting state.
6.3.4 Ordered Quintuple Specification of 2DFA


Formally, a 2DFA M is a five-tuple M = (Q, Σ, δ, q0, F)
where,


a. Q is a finite set of states,
b. Σ is a finite set of input symbols,
c. δ : Q × Σ → Q × {L, R} is the transition function,
d. q0 ∈ Q is the start state,
e. F ⊆ Q is the set of accept states.


The transition function δ is a map from
Q × Σ → Q × {L, R}.


■ If δ(q1,a) = (q2,L), then on reading the input symbol 'a' the 2DFA moves from state
q1 to state q2 and moves its read-head left by one square.

■ If δ(q1,a) = (q2,R), then on reading the input symbol 'a' the 2DFA moves from state
q1 to state q2 and moves its read-head right by one square.


6.3.5 Description of a 2DFA


The transitions of a 2DFA can be represented using a transition diagram or a transition table.
Transition diagram: In the directed graph, an arc goes from one vertex (which corresponds
to the state p) to another vertex (which corresponds to the state q). The edge label can
be written in different forms as follows:


a. Form-1: input symbol (a) → Left (L) or Right (R), for example a → L.


Figure 6.18. Edge Label format.


b. Form-2: input symbol (a)/Left (L) or Right (R), for example a/R.

Figure 6.19. Edge Label format.
Transition table: The description of how a 2DFA operates for a given set of symbols on
the tape can be represented in a tabular format called a transition table. Different table forms
are as follows:


Form-1: Current State | Input Symbol | New State | Move


Table 6.1 Transition table to represent 2DFA.


Form-2: one row per state and one column per input symbol a1, a2, ..., an; each entry is a
pair (Sj, Move), where Move = L or R.


Table 6.2 Transition table to represent 2DFA.


EXAMPLE 6.3.1: Consider a 2DFA M = (Q, Σ, δ, q0, F) where
Q = {q0, q1, q2, q3},
Σ = {0, 1},
q0 = {q0},
F = {q1}


and δ (using the transition diagram and the table, in different forms) is described below:


Transition diagram: (using form-2)


Figure 6.20. Transition diagram 2DFA. 
Transition table: (using form-1)


Current State | Input Symbol | New State | Move
q0 | 0 | q0 | R
q0 | 1 | q1 | R
q1 | 0 | q1 | R
q1 | 1 | q2 | L
q2 | 0 | q3 | R
q2 | 1 | q2 | L
q3 | 0 | q0 | R
q3 | 1 | q3 | R


Table 6.3 Transition table using form-1.


Transition table: (using form-2)


States | 0 | 1
q0 | (q0, R) | (q1, R)
q1 | (q1, R) | (q2, L)
q2 | (q3, R) | (q2, L)
q3 | (q0, R) | (q3, R)


Table 6.4 Transition table using form-2.


6.3.6 Instantaneous Description of a 2DFA 


To describe the behavior of a 2DFA for a given input, an instantaneous description (ID) is
introduced that describes


■ the input string,
■ the current state, and
■ the current position of the read-head.


An ID of a 2DFA is a string xqy, where q is the current state, xy is the input string on
the tape and the read-head points to the first character of the substring y. The initial ID is
denoted by qxy, where q is the start state and the head points to the first symbol of x (from
the left). The final ID is denoted by xyqε, where 'ε' indicates that the head has moved off the
right end of the input.

Formally, the relation ⊢ on IDs is defined as


I1 ⊢ I2 iff M can go from I1 to I2 in one move, where each ID is a string in Σ*QΣ*.
EXAMPLE 6.3.2: Consider a 2DFA, M = (Q, Σ, δ, q0, F), where Q = {q0, q1, q2, q3, q4},
Σ = {0, 1}, q0 = q0, F = {q2} and δ is given by:


States | 0 | 1
q0 | (q0, R) | (q1, R)
q1 | (q1, R) | (q2, R)
q2 | (q2, R) | (q3, L)
q3 | (q4, L) | (q3, L)
q4 | (q0, R) | (q4, L)


Table 6.5 Transition table for 2DFA.


The action of the 2DFA on the string w = 1001 is shown below:


Figure 6.21. ID of the 2DFA for the string w = 1001.


Formally,
ID: q0 1001 ⊢ 1 q1 001
⊢ 10 q1 01
⊢ 100 q1 1
⊢ 1001 q2 ε.
6.3.7 Language accepted by a 2DFA


If the read-head moves off the right end of the tape while entering an accepting state, a 2DFA is
said to have accepted the input string. The language accepted by a 2DFA M can be defined
as


L(M) = {w | q0 w ⊢* w q for some q ∈ F}.


For the above example, for the input string w = 1001, the read-head falls off the right end of the
string while entering an accepting state. Thus, the string 1001 is accepted by the above 2DFA.
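
The acceptance condition can be checked mechanically. The following is a minimal sketch, not part of the formal definition: the function name `run_2dfa` is hypothetical, rejecting when the head falls off the left end is an assumption (the text does not define that case), and the loop check relies on the fact that a deterministic machine which repeats a (state, position) pair can never halt.

```python
def run_2dfa(delta, start, finals, w):
    """Return 'accept', 'reject' or 'loop' for the input string w."""
    state, pos = start, 0
    seen = set()
    while 0 <= pos < len(w):
        if (state, pos) in seen:
            return "loop"            # same configuration twice: it repeats forever
        seen.add((state, pos))
        if (state, w[pos]) not in delta:
            return "reject"          # no move defined for this state and symbol
        state, move = delta[state, w[pos]]
        pos += 1 if move == "R" else -1
    # The head has left the tape: accept only off the right end in a final state.
    return "accept" if pos == len(w) and state in finals else "reject"


# Transition table 6.5 of Example 6.3.2
table_6_5 = {
    ("q0", "0"): ("q0", "R"), ("q0", "1"): ("q1", "R"),
    ("q1", "0"): ("q1", "R"), ("q1", "1"): ("q2", "R"),
    ("q2", "0"): ("q2", "R"), ("q2", "1"): ("q3", "L"),
    ("q3", "0"): ("q4", "L"), ("q3", "1"): ("q3", "L"),
    ("q4", "0"): ("q0", "R"), ("q4", "1"): ("q4", "L"),
}
print(run_2dfa(table_6_5, "q0", {"q2"}, "1001"))   # accept
```

The run visits the configurations q0 1001, 1 q1 001, 10 q1 01, 100 q1 1 and 1001 q2 ε, matching the IDs listed in Example 6.3.2.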


EXAMPLE 6.3.3: Consider a 2DFA, M = (Q, Σ, δ, q0, F), where Q = {q0, q1, q2, q3, q4, q5,
q6, q7, q8, q9}, Σ = {0, 1}, q0 = q0, F = {q0} and δ is given by:


Current State | Input Symbol | New State | Move
q0 | 0 | q0 | R
q0 | 1 | q1 |
q1 | 0 or 1 | q2 |
q2 | 0 or 1 | |


Table 6.6 Transition table 2DFA 


Show that the input string 11001010 is accepted by the above 2DFA.
Solution: For the given input string w = 11001010, we need to show that the read-head of
the 2DFA moves off the right end of the tape while entering the accepting state q0. We show this
using the IDs of the 2DFA.


ID: q0 11001010 ⊢ 1 q1 1001010
⊢ 11 q2 001010
⊢ 110 q3 01010
⊢ 1100 q4 1010
⊢ 11001 q5 010
⊢ 1100 q6 1010
⊢ 110 q7 01010
⊢ 11 q8 001010
⊢ 1 q9 1001010
⊢ 11 q1 001010
⊢ 110 q2 01010
⊢ 1100 q3 1010
⊢ 11001 q4 010
⊢ 110010 q5 10
⊢ 1100101 q0 0
⊢ 11001010 q0 ε.


Since the read-head falls off the right end of the string while entering an accepting state, the given
string is accepted by the above 2DFA.


6.4 DFA vs. 2DFA


a. DFA and 2DFA differ in the sense that an input can be read only once from left to right
by a DFA, whereas a 2DFA can read the input back and forth with no limit on the number
of times an input symbol can be read.

b. Transition function

For a DFA, 'δ' is defined as δ: Q × Σ → Q.
For a 2DFA, 'δ' is defined as δ: Q × Σ → Q × {L, R}.

c. 2DFA is indeed a very meaningful and arguably more realistic model than a DFA.

d. A DFA is the restricted version of a 2DFA. Therefore, a 2DFA can solve any problem
that is solvable by a DFA.

e. A 2DFA can be significantly simpler in design than the corresponding DFA used for solving
the same problem.


1. Define optimum DFA. 

2. Define the following: 
a. Distinguishable and nondistinguishable states. 
b. Unreachable and dead states. 
3. Obtain the indistinguishable states from the DFA, given in figure 6.22.


Figure 6.22. State Transition Diagram.


4. Give the general procedure to minimise the states of DFA.
5. Reduce the DFAs given below:


Figure 6.23. State Transition Diagram. 


6. Identify all nonreachable states from the DFA, given in figure 6.24. 
7. Reduce the following DFA, where q is the start state and gg is the final state. 


Figure 6.24. State Transition Diagram. 


8. Identify the unreachable and indistinguishable states from the DFA, given in figure 
6.25. 




Figure 6.25. State Transition Diagram. 


9. Present the formal definition of a 2DFA.
10. Explain, with an example, the instantaneous description of a 2DFA.
11. Consider a 2DFA M = (Q, Σ, δ, q0, F) where q0 = {q0}, F = {q0, q1, q2} and δ is
given by

Current State | Symbol | New State | Move
q0 | 0 | q0 | R
q0 | 1 | q1 | R
q1 | 0 | q1 | R
q1 | 1 | q2 | L
q2 | 0 | q0 | R
q2 | 1 | q2 | L


Table 6.7 Transition table

Show that the input string 10111 is not accepted by the given 2DFA, which runs in an
infinite loop on it.


12. Bring out the differences between DFA and 2DFA with examples. 
Finite Automata and Regular
Expressions


‘Enjoy the beauty in the development of the subject through expressions.’ 


Introduction 


Regular expressions have an important role in computer science applications. For example, 
in applications involving text, if users are interested in searching for strings that satisfy 
certain patterns, then regular expressions certainly provide a powerful method for describing 
such patterns. Further, utilities like AWK and GREP in UNIX, text editors and modern 
programming languages such as PERL etc., provide mechanisms for the meaningful 
description of patterns using regular expressions. In chapter 3, regular expressions and 
languages were discussed in detail. In this chapter, equivalence of DFA and NFA with 
regular expressions is discussed. We briefly recall the definition of a regular language as the 
language represented by a regular expression. If r is a regular expression, then L(r) denotes 
the language associated with r. This language is defined, formally, as follows: 


■ φ is a RE denoting the empty set.
■ ε is a RE denoting {ε}.
■ For every a ∈ Σ, a is a RE denoting {a}.


7.1 Properties of Regular Sets or Regular Languages 


7.1.1 Closure Properties 


Theorem I: Regular languages are closed under union, concatenation and Kleene closure
(closure properties).


Proof: Method-1


Given two regular languages L1 and L2, if R1 and R2 are the REs denoting them, then by
the definition of RE,

a. R1 + R2 denotes the union of the two regular languages L1 and L2 ⇒ L1 ∪ L2 is also
regular.

b. R1 · R2 denotes the concatenation of the two regular languages L1 and L2 ⇒ L1L2 is also
regular.

c. R1* denotes the Kleene closure of L1 ⇒ L1* is also regular.


EXAMPLE 7.1.1: (method 1)
(i) If R1 and R2 are REs with R1 = {ab, c} and R2 = {d, ef}, then
⇒ R1 ∪ R2 = {ab, c} ∪ {d, ef}
= {ab, c, d, ef}
⇒ R1R2 = {ab, c} · {d, ef}
= {abd, abef, cd, cef}.
If R = {ab, c}
⇒ R* = {ε, ab, c, abab, abc, cab, cc, ...}.
(ii) If r = a + b, then the language of r is
L(r) = L(a + b)
= L(a) ∪ L(b)
= {a} ∪ {b}
⇒ L(r) = {a, b} is also regular.
(iii) If r = ab*a, then the language of r is
L(r) = L(ab*a)
= L(a) · L(b*) · L(a)
= {a} · (L(b))* · {a}
= {a} · {ε, b, bb, bbb, ...} · {a}
⇒ L(r) = {aa, aba, abba, ...} is also regular.
(iv) If r = a*, then the language of r is
L(r) = L(a*)
= (L(a))*
⇒ L(r) = {ε, a, aa, ...} is also regular.
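
The set computations of part (i) can be reproduced with Python sets. This is a small sketch with hypothetical helper names (`concat`, `star_up_to`). Since R* is infinite, the star is truncated at a chosen maximum length.

```python
def concat(A, B):
    """Concatenation of two languages given as finite sets of strings."""
    return {x + y for x in A for y in B}


def star_up_to(A, max_len):
    """All strings of A* whose length is at most max_len."""
    result, frontier = {""}, {""}
    while frontier:
        # extend the newest strings by one more factor from A
        frontier = {w for w in concat(frontier, A) if len(w) <= max_len} - result
        result |= frontier
    return result


R1, R2 = {"ab", "c"}, {"d", "ef"}
print(sorted(R1 | R2))                  # ['ab', 'c', 'd', 'ef']
print(sorted(concat(R1, R2)))           # ['abd', 'abef', 'cd', 'cef']
print(sorted(star_up_to(R1, 3), key=lambda w: (len(w), w)))
# ['', 'c', 'ab', 'cc', 'abc', 'cab', 'ccc']
```

The empty string `''` plays the role of ε, and strings such as abab appear once the length bound is raised to 4.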


Proof: Method-2 


Assume that FA1 and FA2 are two finite automata accepting languages L1 and L2,
defined by regular expressions r1 and r2, respectively, as


FA1 = (Q1, Σ1, δ1, q1, f1), FA2 = (Q2, Σ2, δ2, q2, f2).


Case 1: Construction of L1 + L2


Figure 7.1. Automaton for L1 + L2.


Case 2: Construction of L1L2


Figure 7.2. Automaton for L1L2.


Case 3: Construction of L1*


Figure 7.3. Automaton for L1*.


EXAMPLE 7.1.2: (method 2)
Let Σ = {a, b}, L1 = all words ending with a,
L2 = all words containing the substring aa.


r1 = (a+b)*a, r2 = (a+b)*aa(a+b)*.


Figure 7.4. FA1 for r1.


Figure 7.5. FA2 for r2.


(i) r1 + r2 = (a+b)*a + (a+b)*aa(a+b)*.

Figure 7.6. FA for r1 + r2.


(ii) r1r2 = (a+b)*a · (a+b)*aa(a+b)*


Figure 7.7. FA for r1r2.

(iii) r2* = ((a+b)*aa(a+b)*)*


Figure 7.8. FA for r2*.


7.1.2 Complementation of a Regular Language 


As discussed earlier, the complement of a language L is a language L̄ such that
L̄ = Σ* − L.
For example, if Σ = {a, b}, L = {a, b, aa} then L̄ is as follows:
L̄ = {ε, a, b, aa, bb, aaa, bbb, ba, bab, abbab, ...} − {a, b, aa}.
L̄ = {ε, bb, aaa, bbb, ba, bab, abbab, ...}.


Theorem II 

Regular languages are closed under complementation. 

TPT: If L is a regular language and L ⊆ Σ*, then L̄ = Σ* − L is also regular.
Proof:


Suppose L is a regular language over the alphabet Σ; then there is a DFA, D =
(Q, Σ, δ, q0, F), accepting L = L(D). D is taken to be complete, i.e. δ is defined for
every state and every input symbol.


To accept L̄ = Σ* − L, complement the final states of D. For this, consider another DFA,


D′ = (Q, Σ, δ, q0, F′),


where F′ = Q − F, i.e., D and D′ differ only in their final states.
Now, if 'x' is an input string/word, then x ∈ L(D′)
⇔ δ(q0,x) ∈ F′
⇔ δ(q0,x) ∈ Q − F
⇔ δ(q0,x) ∉ F
⇔ x ∉ L
⇔ x ∈ L̄
⇔ x ∈ Σ* − L.
Therefore, L(D′) = L̄ = Σ* − L.
Hence, L̄ is a regular language since it is accepted by a DFA.
This can be explained as follows:


Consider a DFA D, accepting a language L which is regular. Now, create a DFA D′
that accepts the language L̄, as follows:


a. D′ has the same states and arcs as D.
b. Every final state of D becomes a non-final state in D′.
c. Every non-final state of D becomes a final state in D′.


⇒ non-final states ↔ final states


Thus, the resulting DFA D′ accepts L̄, which is a regular language.
EXAMPLE 7.1.3: (Complementation)


If Σ = {a, b} and L = all words of length at least 2 whose second letter is b, then L̄ = all
words of length less than 2 or whose second letter is a. Changing all non-final states to final
and all final states to non-final in the DFA D gives D′, which accepts L̄.


Figure 7.9. DFA D


Figure 7.10. DFA D′
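
The construction of D′ can be tried out in a few lines of Python. The sketch below is only illustrative: the state names (`s0`, `s1`, `yes`, `no`) and the transition table are assumptions that realise the language L of Example 7.1.3, since figures 7.9 and 7.10 are not reproduced here.

```python
from itertools import product


def accepts(delta, start, finals, w):
    """Run a complete DFA on w and report whether it ends in a final state."""
    state = start
    for ch in w:
        state = delta[state, ch]
    return state in finals


# Assumed DFA D for L = words of length >= 2 whose second letter is b
states = {"s0", "s1", "yes", "no"}
delta = {
    ("s0", "a"): "s1", ("s0", "b"): "s1",     # first letter: anything
    ("s1", "a"): "no", ("s1", "b"): "yes",    # second letter decides
    ("yes", "a"): "yes", ("yes", "b"): "yes",
    ("no", "a"): "no", ("no", "b"): "no",
}
finals = {"yes"}
co_finals = states - finals                    # F' = Q - F gives D'

words = ["".join(t) for n in range(5) for t in product("ab", repeat=n)]
in_L = [w for w in words if accepts(delta, "s0", finals, w)]
in_co_L = [w for w in words if accepts(delta, "s0", co_finals, w)]
print(in_L[:3])
print(in_co_L[:5])
```

Every word over {a, b} lands in exactly one of the two lists, because the same run of the DFA ends either in F or in Q − F.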


EXAMPLE 7.1.4: Give examples for closure properties of regular languages.
Let L1 = {aⁿb}, L2 = {ba}. The automata for L1 and L2 are shown in figure 7.11 and
figure 7.12 respectively.


Figure 7.11. M1 for L1, i.e. L1 = L(M1)

Figure 7.12. M2 for L2, i.e. L2 = L(M2)

■ NFA for L1 ∪ L2 is:


Figure 7.13. L1 ∪ L2 = {aⁿb} ∪ {ba}


■ NFA for L1 · L2 is:


Figure 7.14. L1 · L2 = {aⁿb} · {ba} = {aⁿbba}.


■ NFA for L1* is:


Figure 7.15. L1* = {aⁿb}*
7.1.3 Intersection of Two Regular Languages


Intersection of two regular languages L1 and L2 is defined as L1 ∩ L2.


Theorem III
Regular languages are closed under intersection.
T.P.T: If L1 and L2 are regular languages, then L1 ∩ L2 is also regular.
Proof:
Given: Two regular languages L1 and L2.
Applying de Morgan's law,
L1 ∩ L2 is the complement of L̄1 ∪ L̄2.

Since L1, L2 are regular,

⇒ L̄1, L̄2 are regular (∵ regular sets are closed under complementation)


⇒ L̄1 ∪ L̄2 is regular (∵ regular sets are closed under union)


⇒ the complement of L̄1 ∪ L̄2 is regular (∵ regular sets are closed under complementation)


⇒ L1 ∩ L2 is regular (by de Morgan's law).


Hence, proved. 
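
The proof above goes through complements and unions. A more direct route, shown here only as an illustrative alternative and not as the method of the proof, is the product construction: run both DFAs in parallel on pairs of states and accept when both components are final. The two DFAs below are assumed machines for the languages L1 (words ending with a) and L2 (words containing aa) of Example 7.1.2; all names are hypothetical.

```python
from itertools import product


def run(delta, start, w):
    """Return the state a complete DFA reaches on input w."""
    state = start
    for ch in w:
        state = delta[state, ch]
    return state


# Assumed DFA for L1: words ending with a (final state e1)
d1 = {("e0", "a"): "e1", ("e0", "b"): "e0",
      ("e1", "a"): "e1", ("e1", "b"): "e0"}
# Assumed DFA for L2: words containing aa (final state c2)
d2 = {("c0", "a"): "c1", ("c0", "b"): "c0",
      ("c1", "a"): "c2", ("c1", "b"): "c0",
      ("c2", "a"): "c2", ("c2", "b"): "c2"}


def product_dfa(d1, d2, alphabet):
    """Transition table on pairs (p, q): both machines move on the same symbol."""
    states1 = {p for p, _ in d1}
    states2 = {q for q, _ in d2}
    return {((p, q), a): (d1[p, a], d2[q, a])
            for p in states1 for q in states2 for a in alphabet}


dp = product_dfa(d1, d2, "ab")
final_pairs = {("e1", "c2")}            # final in both components


def in_intersection(w):
    return run(dp, ("e0", "c0"), w) in final_pairs


words = ["".join(t) for n in range(7) for t in product("ab", repeat=n)]
print([w for w in words if in_intersection(w)][:4])
```

Choosing the pairs in which at least one component is final would give a DFA for the union instead.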


7.2 Arden’s Theorem 


Let P and Q be two regular expressions over Σ. If P does not contain ε, then the equation
R = Q + RP has a unique solution given by

R = QP*.

If Q = ε then, R = Q + RP

⇒ R = ε + RP
⇒ R = εP*
⇒ R = P*.


This theorem is used to find the RE recognised by a given transition system.
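
That R = QP* does satisfy the equation can be checked by substitution. The short derivation below is a sketch of this one direction only, using the identity ε + P*P = P*. It does not show uniqueness, which is where the condition that P does not contain ε is needed.

```latex
\begin{align*}
Q + RP &= Q + (QP^{*})P \\
       &= Q(\epsilon + P^{*}P) \\
       &= QP^{*} \\
       &= R.
\end{align*}
```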
7.3 Equivalence of Finite Automata and Regular Expressions


It is interesting to note that regular expressions and finite automata are equivalent in their
descriptive power, although superficially they appear to be rather different. In fact, any
regular expression can be converted into a finite automaton that recognises the language it
describes and vice-versa. At this point, it can be recalled that a regular language is one that
is recognised by some finite automaton.

The languages accepted by finite automata, are the languages denoted by regular 
expressions. In other words, we can say that regular expressions define the class of languages 
that are accepted by finite automata. This implies that: 


a. every language accepted by a finite automaton can also be defined by a regular 
expression, 
b. for every regular expression, there is an equivalent NFA with €-transitions. 


7.4 Cycle of Constructions 


Regular expression (RE) → Non-deterministic finite automata (NFA) → Deterministic
finite automata (DFA) → Minimal deterministic finite automata (MDFA)


Figure 7.16. Cycle of Construction.


NFA → DFA (subset construction)

DFA → Minimal DFA

RE → NFA (build an NFA for each term; combine them with ε-moves)
DFA → RE
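
The NFA → DFA step of the cycle can be sketched as follows. This is a minimal version of the subset construction for an NFA without ε-moves. The function name and the small example NFA (words over {a, b} ending with a) are assumptions chosen for illustration.

```python
def subset_construction(nfa, start, finals, alphabet):
    """Build a DFA whose states are sets of NFA states (no ε-moves assumed)."""
    start_set = frozenset([start])
    dfa, todo, seen = {}, [start_set], {start_set}
    while todo:
        S = todo.pop()
        for a in alphabet:
            # all NFA states reachable from some state of S on symbol a
            T = frozenset(t for s in S for t in nfa.get((s, a), ()))
            dfa[S, a] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    # a subset is final if it contains at least one final NFA state
    return dfa, start_set, {S for S in seen if S & finals}


# Assumed NFA: state 0 loops on a and b and guesses the last a by moving to 1
nfa = {(0, "a"): {0, 1}, (0, "b"): {0}}
dfa, d_start, d_finals = subset_construction(nfa, 0, {1}, "ab")


def dfa_accepts(w):
    S = d_start
    for ch in w:
        S = dfa[S, ch]
    return S in d_finals


print(len({S for S, _ in dfa}), dfa_accepts("bba"), dfa_accepts("ab"))   # 2 True False
```

Only the subsets reachable from the start set are generated, so the resulting DFA here has two states rather than all four subsets of {0, 1}.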


7.5 Equivalence of DFA and Regular Expressions


If L is accepted by a DFA, then L can be expressed by a regular expression. In other words,
for any given DFA, we can obtain a RE and vice-versa.
The following are the two methods of constructing a RE from a given DFA.


a. By solving equations. 
b. By using transition diagrams. 


7.5.1 Construction of RE from given DFA by Solving Equations


The following are the assumptions made regarding the transition diagram:


a. The diagram must not have ε-moves.
b. It has an initial state, say q1.
c. It has vertices q1, q2, ..., qn.
d. qi is the RE representing the set of strings accepted by the system when qi is taken as
the final state.


The following are the steps required to construct a RE:


a. For each of the states q1, ..., qn in the DFA, write down the equation by considering
all edges that enter that state.

b. For the initial state of the DFA, ε is added to the equation.

c. Compute the equation for each state.

d. Substitute the results of each state equation into the final state equation of the DFA,
to get a RE for the DFA.


EXAMPLE 7.5.1: Construct a RE corresponding to the DFA represented by table 7.1. q1 is
both the initial and the final state. The transition table is given below:


Present State | Input 0 | Input 1
q1 | q1 | q2
q2 | q3 | q2
q3 | q1 | q2

Table 7.1 State Transition Table 
Solution:


Figure 7.17. Transition Diagram.


Now, the equations for each state can be written by considering all edges that enter
that state.


Thus,
q1 = q1 0 + q3 0 + ε (ε, only for the initial state)
q2 = q1 1 + q2 1 + q3 1
q3 = q2 0.


Substituting q3 in q2:


⇒ q2 = q1 1 + q2 1 + q2 01
= q1 1 + q2(1 + 01)
= q1 1(1 + 01)* (∵ R = Q + RP ⇒ R = QP*)
Now,
q1 = q1 0 + q3 0 + ε
= q1 0 + q2 0·0 + ε
= q1 0 + q1 1(1 + 01)*·00 + ε
= ε(0 + 1(1 + 01)*·00)*
(∵ R = Q + RP ⇒ R = QP*)
q1 = (0 + 1(1 + 01)*·00)*
Since q1 is the final state, the RE for the DFA will be


RE = (0 + 1(1 + 01)*00)*.
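
A result obtained by solving equations can be sanity-checked by brute force: run the DFA of table 7.1 and the derived expression on every short string and compare. The sketch below uses Python's `re` module, in which union is written `|` instead of `+`. The length bound of 8 is an arbitrary choice, and agreement on short strings is evidence, not a proof.

```python
import re
from itertools import product

# Table 7.1: q1 is both the initial and the final state
delta = {
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q3", ("q2", "1"): "q2",
    ("q3", "0"): "q1", ("q3", "1"): "q2",
}
pattern = re.compile(r"(0|1(1|01)*00)*")     # (0 + 1(1 + 01)*00)*


def dfa_accepts(w):
    state = "q1"
    for ch in w:
        state = delta[state, ch]
    return state == "q1"


# every string over {0, 1} of length 0..8 on which the two disagree
mismatches = [
    w
    for n in range(9)
    for w in map("".join, product("01", repeat=n))
    if dfa_accepts(w) != bool(pattern.fullmatch(w))
]
print(mismatches)   # []
```

The same comparison can be reused for the other examples of this section by swapping in their transition tables and expressions.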
EXAMPLE 7.5.2: Construct a RE for the DFA in figure 7.18.


Figure 7.18. Transition Diagram.


Solution: Writing the equations for each state, by considering all edges that enter
that state:
A = A0 + ε
B = A1 + B1
C = B0 + C(0 + 1)
Since
R = Q + RP ⇒ R = QP*,
consider
A = A0 + ε ⇒ A = ε0* (∵ R = A, Q = ε, P = 0)
A = 0*.
Now, substitute A in B = A1 + B1
⇒ B = 0*1 + B1
B = 0*11* (∵ R = Q + RP ⇒ R = QP*).
Since A and B are the final states in the above DFA, the addition of A and B gives
⇒ A + B = 0* + 0*11*
= 0*(ε + 11*)
= 0*1* (∵ ε + RR* = R*)
⇒ RE = 0*1*.
EXAMPLE 7.5.3: Construct RE for the DFA in figure 7.19.


Figure 7.19. Transition Diagram.


Solution:
q1 = ε, q2 = q1 0 + q3 0 and q3 = q2 0.
Computing q2:
q2 = q1 0 + q3 0
q2 = ε0 + q2 0·0
q2 = ε0(00)*


⇒ RE = 0·(00)*.


EXAMPLE 7.5.4: Construct RE for the DFA in figure 7.20.


Figure 7.20. Transition Diagram.


Solution:
q1 = ε
q2 = q1(a+b) + q3(a+b) + q4(a+b)
q3 = q2 a
q4 = q2 b
Now, computing q2 (because computing for q3 leads to complications):
⇒ q2 = q1(a+b) + q3(a+b) + q4(a+b)
q2 = ε(a+b) + q2 a(a+b) + q2 b(a+b)
q2 = ε(a+b) + q2(a(a+b) + b(a+b))
q2 = ε(a+b)·(a(a+b) + b(a+b))*
⇒ RE = (a+b)·(a(a+b) + b(a+b))* = (a+b)·((a+b)(a+b))*.
Note that (X + Y)* is in general not equal to X*·Y*, so the two cycles must stay inside one
starred union.


EXAMPLE 7.5.5: Construct RE for the DFA given in figure 7.21.


Figure 7.21. Transition Diagram.
Solution:
q1 = ε, q2 = q1(a+b),
q3 = q2 a + q4 a,
q4 = q3(a+b), and
q5 = q2 b + q4 b + q5(a+b).
Since q1 = ε ⇒ q2 = ε(a+b)
Now, computing q3:
q3 = (a+b)a + q4·a
q3 = (a+b)a + q3·(a+b)a
q3 = (a+b)a·((a+b)a)* (∵ R = Q + RP ⇒ R = QP*, with P = (a+b)a)
⇒ RE = (a+b)a·((a+b)a)*.
EXAMPLE 7.5.6: Construct a RE for the given transition diagram.


Figure 7.22. Transition Diagram.


Solution:
q1 = q2(a+b) + ε
q2 = q1(a+b)
Computing for q1:
q2 = q1(a+b)
q1 = q1(a+b)(a+b) + ε
q1 = ε((a+b)(a+b))*
⇒ RE = ((a+b)(a+b))*.


7.5.2 Construction of RE from the given DFA by using the
Transition Diagram


The following are some transition diagrams and their equivalent REs.
a. If a transition diagram has no final state, then RE = φ, i.e., the FA does not accept any
string.


Figure 7.23. RE = φ


b. If a transition diagram has the same start and final states, then RE = ε, i.e., the FA
accepts the empty string ε.


Figure 7.24. RE = ε


c. For a transition diagram with a transition from a state to itself (say for the input symbol b,
or for a or b), RE = b* or RE = (a+b)*.

(i) RE = b* (ii) RE = (a+b)* (iii) RE = b*


Figure 7.25. Transition Diagram with Transition to Itself.


d. For a transition diagram which goes to the final state on the input symbols a and then b,
RE = ab.


Figure 7.26. RE = ab.


e. Transition diagram with cycle.


■ One cycle: For a transition diagram with one cycle q1 → q2 → q1, reading a and then b,
RE = (ab)*.

Figure 7.27. Transition diagram for one cycle


■ More than one cycle (independent cycles): For a transition diagram with more
than one cycle, with input strings a and b as part of each cycle, the RE is
RE = (ab)* + (ab)* + ... or RE = (ab + ab + ...)*.

Two cycles: (i) q1 → q2 → q1
(ii) q1 → q3 → q1
RE = (ab)* + (ab)* or RE = (ab + ab)*
Figure 7.28. Transition Diagram for more than One Cycle
EXAMPLE 7.5.7: Construct the regular expression from the DFAs given below:


i.

Figure 7.29. State Diagram.


Solution:
L(DFA) = ε or RE = ε.


This DFA accepts ε because it goes from the start state to the final state without reading
any symbol (i.e. by reading the empty string ε). It accepts nothing else because any
non-empty string would take it to state 2, which is not a final state, and it stays there.


ii.

Figure 7.30. State Diagram.


Solution:
RE = φ


This DFA does not accept any string because it has no accepting state. Thus, the RE is φ.


iii.

Figure 7.31. State Diagram.

Solution:
RE = a(ab)*aa.


This DFA has a cycle 1-2-1 and it can go through this cycle any number of times, by reading
the substring 'ab' repeatedly.

To find the RE:

First, from the start state it goes to state 1 by reading one 'a'. Then, from state 1, it goes
through the cycle 1-2-1 any number of times, by reading the substring ab any number of
times, and comes back to state 1. This is represented by (ab)*. Then from state 1, it goes to
state 2 and then to state 3, by reading aa. Thus, the RE is a(ab)*aa.


iv.

Figure 7.32. State Diagram.
Solution:


RE = (ab + bb)*


This DFA has 2 independent cycles 0-1-0 and 0-2-0. It can move through these cycles any
number of times, in any order, to reach the accepting state from the initial state, such as
0-1-0-2-0-2-0. Thus, the RE is (ab + bb)*.


v.

Figure 7.33. State Diagram.

Solution:
RE = a(bba + baa)*bb.


This DFA has two cycles. They are 1-2-0-1 and 1-2-3-1.
To find the RE of this DFA:

First, from state 0, go to state 1 by reading a (any other state which is common to these
cycles, such as state 2, can also be used instead of state 1). Then, from state 1 go through the 2
cycles 1-2-0-1 and 1-2-3-1 any number of times (in any order), by reading the substrings bba
and baa respectively. At this point, a substring a(baa + bba)* would have been read. Then,
go from state 1 to state 2 and then to state 3 by reading bb. Thus, altogether a(baa + bba)*bb
would have been read when state 3 is reached from state 0.


vi.

Figure 7.34. State Diagram.
Solution:


RE = b* + b*a(ba)*.


This DFA has 2 accepting states, 0 and 1. Thus, the RE of this DFA is the union of the REs
corresponding to states 0 and 1.

The RE at state 0: At state 0, read any number of b's. This gives b*.

The RE at state 1: First, at state 0, read any number of b's, then go to state 1 by reading
one 'a'. At this point (b*a) will be read. At state 1, go through the cycle 1-2-1 any number
of times, by reading the substring ba repeatedly. This gives b*a(ba)*. Thus, the RE is
b* + b*a(ba)*.


vii.

Figure 7.35. State Transition Diagram.

Solution:
RE = (aa + bb)(ba)*(a + b)*.


viii.

Figure 7.36. State Transition Diagram.


RE = (a+b)(a+b)ab(a+b)*


7.5.3 To Find the Language Acceptance of FA from Transition
Diagrams


As discussed in chapter 1, a FA is said to have accepted a string 's', where s = w1w2...wn,
if there is a path in the transition diagram such that it


a. begins at a start state,
b. ends at an accepting state and
c. has the sequence of labels w1, w2, ..., wn.


In general, to ascertain the language accepted by a FA, say 'M', first find the RE from the
given FA; then L(M) is the language denoted by that RE.


EXAMPLE 7.5.8: What is the language accepted by the following DFAs?


i.

Figure 7.37. State Diagram.


Solution:
RE = 0·(00)* ⇒ L(M) = 0(00)*


i.e. a machine which accepts strings consisting of an odd number of zeroes.

ii.

Figure 7.38. State Diagram.


Solution:
RE = a*b* ⇒ L(M) = a*b*


i.e. a machine that accepts any number of a's followed by any number of b's.


iii.

Figure 7.39. State Diagram.
Solution:


RE = (a+b)·(a(a+b))*·(b(a+b))*
⇒ L(M) = (a+b)(a(a+b))*(b(a+b))*


iv.

Figure 7.40. State Diagram

Solution:
RE = aba + ba ⇒ L(M) = aba + ba


i.e. a machine that accepts the words aba or ba.


v.

Figure 7.41. State Diagram.
Solution:


RE = a(a+b)*a + b(a+b)*b
Thus, L(M) = a(a+b)*a + b(a+b)*b


i.e. a machine that accepts words that begin and end with 'a' or begin and end with 'b'.


vi.

Figure 7.42. State Diagram.


Solution: 
RE = (aa + bb)* - (ab + ba)* => L(M) = (aa + bb)* - (ab + ba)* 


i.e. a machine that accepts even number of aa and even number of bb. 
7.5.4 Construction of a DFA for a given RE


EXAMPLE 7.5.9: Construct a DFA for the REs given below:


a. RE = φ.
Solution:
Figure 7.43. DFA that Accepts Nothing, since it has no Final State.


b. RE = ε over Σ = {a, b}.
Solution:


Figure 7.44. DFA for Accepting ε (Since start and final states are the same).


c. RE = (a+b)*
Solution:


Figure 7.45. DFA that accepts ε or any string of a's and b's.


d. RE = a(a+b)*
Solution:


Figure 7.46. DFA that accepts words beginning with a, followed by any number of a's or b's.
e. RE = (a+b)(a+b)* over Σ = {a, b} except ε.


Solution:


Figure 7.47. State Diagram for DFA.


f. RE = (a)*(ba)*b* over Σ = {a, b} except ε.


Solution:


Figure 7.48. DFA to Accept a*(ba)*b*.


g. RE = ε + aaa + bbbb over Σ = {a, b}; accept only the words ε, aaa, bbbb.
Solution:


Figure 7.49. State Diagram for DFA.


h. RE = (a+b)*aba
Solution:


Figure 7.50. DFA that Accepts Only those Words that end in aba.
i. RE = (a+b)a[(a+b)a]*
Solution:


Figure 7.51. DFA that accepts all words whose second letter is a, that have an even number
of letters in total and a in every even position.


j. RE = baa + ab + abb.
Solution:


Figure 7.52. DFA for RE = baa + ab + abb.


k. RE = (a+b)*aa(a+b)*
Solution:


Figure 7.53. DFA that accepts all words containing substring aa.
7.6 Equivalence of NFA and Regular Expressions


From a given RE, we can construct an NFA accepting the language defined by that
RE.


Theorem IV
For every regular expression r, there exists an NFA with ε-transitions that accepts L(r).
T.P.T: We can construct an NFA accepting the language defined by a given RE.


Proof: This theorem can be proved by induction on the number of operators in the RE 'r',
showing that there is an NFA N with ε-transitions, having one final state and no transitions
out of that state, such that L(N) = L(r).


Basis: (zero operators). There are three possible REs having no operators, r = ε, r = φ
and r = a, as shown in figure 7.54:


Figure 7.54. Automata for r = ε or r = φ or r = a.


Induction: Assume that the theorem is true for REs with fewer than i operators, i ≥ 1, and
let r have i operators. There can be three different cases, depending on the form of the RE:


Case 1: r = r1 + r2.


Let N1 and N2 be two NFAs with ε-transitions accepting languages L1 and L2, defined
by the REs r1 and r2 respectively.


N1 = (Q1, Σ1, δ1, q1, {f1}) and N2 = (Q2, Σ2, δ2, q2, {f2}) with L(N1) = L(r1),
L(N2) = L(r2).
Assume that Q1 ∩ Q2 = φ, and let q0 be a new initial state and f0 be a new final state. Then,
construct


N = (Q1 ∪ Q2 ∪ {q0, f0}, Σ1 ∪ Σ2, δ, q0, {f0})
whose δ is defined by


a. δ(q0, ε) = {q1, q2}.
b. δ(q, a) = δ1(q, a), q ∈ Q1 − {f1}, a ∈ Σ1 ∪ {ε}.
c. δ(q, a) = δ2(q, a), q ∈ Q2 − {f2}, a ∈ Σ2 ∪ {ε}.
d. δ(f1, ε) = δ(f2, ε) = {f0}.


The NFA for the above transitions is given in figure 7.55.


Figure 7.55. Automaton for (r1 + r2).


The above diagram shows that:


■ Any path of N from q0 to f0 must begin by going to either q1 or q2 on ε.
■ If a path goes to q1, then it follows any path in N1 to f1 and then goes to f0 on ε.
■ If a path goes to q2, then it follows any path in N2 to f2 and then goes to f0 on ε.


Thus, L(N) = L(N1) ∪ L(N2) as desired.
Case 2: r = r1 · r2


Let N1 and N2 be two NFAs with ε-transitions accepting languages L1 and L2, defined
by the REs r1 and r2 respectively.


N1 = (Q1, Σ1, δ1, q1, {f1}), N2 = (Q2, Σ2, δ2, q2, {f2}) with L(N1) = L(r1), L(N2) =
L(r2).
Now, with q1 and f2 as the initial and final states, construct


N = (Q1 ∪ Q2, Σ1 ∪ Σ2, δ, q1, {f2}) whose δ is defined by


a. δ(q, a) = δ1(q, a), q ∈ Q1 − {f1}, a ∈ Σ1 ∪ {ε}
b. δ(f1, ε) = {q2}
c. δ(q, a) = δ2(q, a), q ∈ Q2, a ∈ Σ2 ∪ {ε}.


The NFA for the above transitions is given in figure 7.56.


Figure 7.56. Automaton for r1 · r2.


The above diagram shows that:


■ Any path of N from q1 to f2 must begin by going from q1 to f1 on some string x,
followed by f1 to q2 on ε, followed by any path from q2 to f2 on some string y.


Thus, L(N) = {xy | x ∈ L(N1), y ∈ L(N2)} and L(N) = L(N1) · L(N2), as desired.


Case 3: r = r₁*

Let N₁ be an NFA with ε-transitions accepting the language L₁, defined by the RE r₁:

N₁ = (Q₁, Σ₁, δ₁, q₁, {f₁}) with L(N₁) = L(r₁).

Let q₀ and f₀ be a new initial state and a new final state. Then, construct

N = (Q₁ ∪ {q₀, f₀}, Σ₁, δ, q₀, {f₀})

where δ is defined by

a. δ(q₀, ε) = δ(f₁, ε) = {q₁, f₀}.
b. δ(q, a) = δ₁(q, a), q ∈ Q₁ − {f₁}, a ∈ Σ₁ ∪ {ε}.

The NFA for the above transition is given in figure 7.57.

Figure 7.57. Automaton for r₁* (q₀ is the new start state, f₀ is the new final state, and q₁ and f₁ are the original start and final states).

The above diagram shows that:

■ Any path from q₀ to f₀ consists either of the direct move from q₀ to f₀ on ε, or of a move from
q₀ to q₁ on ε, followed by some number (possibly 0) of paths from q₁ to f₁ and then
back to q₁ on ε (each labelled by a string in L₁), followed by a path from q₁ to f₁ (labelled by a
string in L₁), and then a move to f₀ on ε.

Thus, L(N) = L(N₁)* as desired.


Hence, proved. 
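
The three constructions used in the proof (union, concatenation and closure) can be sketched in code. The following is a minimal Python sketch, not part of the original proof: the class `NFA`, the helper names (`symbol`, `union`, `concat`, `star`, `eclose`, `accepts`) and the use of the empty string as the label for ε-moves are all assumptions made for illustration. Each function mirrors the corresponding definition of δ above, and every machine keeps a single final state with no outgoing moves.

```python
from itertools import count

EPS = ""        # label used for ε-moves (an assumption of this sketch)
_ids = count()  # supplies fresh state names 0, 1, 2, ...

class NFA:
    """ε-NFA with one start state and one final state."""
    def __init__(self, start, final, delta):
        self.start, self.final, self.delta = start, final, delta

def _merge(*deltas):
    """Combine transition dictionaries {(state, label): set_of_states}."""
    out = {}
    for d in deltas:
        for key, targets in d.items():
            out.setdefault(key, set()).update(targets)
    return out

def symbol(a):
    """Basis machine accepting the single symbol a."""
    s, f = next(_ids), next(_ids)
    return NFA(s, f, {(s, a): {f}})

def union(n1, n2):
    """Case 1: new q0 and f0; q0 -ε-> {q1, q2}; f1, f2 -ε-> f0."""
    q0, f0 = next(_ids), next(_ids)
    extra = {(q0, EPS): {n1.start, n2.start},
             (n1.final, EPS): {f0},
             (n2.final, EPS): {f0}}
    return NFA(q0, f0, _merge(n1.delta, n2.delta, extra))

def concat(n1, n2):
    """Case 2: start q1, final f2, and f1 -ε-> q2."""
    extra = {(n1.final, EPS): {n2.start}}
    return NFA(n1.start, n2.final, _merge(n1.delta, n2.delta, extra))

def star(n1):
    """Case 3: new q0 and f0; q0 and f1 both move on ε to {q1, f0}."""
    q0, f0 = next(_ids), next(_ids)
    extra = {(q0, EPS): {n1.start, f0},
             (n1.final, EPS): {n1.start, f0}}
    return NFA(q0, f0, _merge(n1.delta, extra))

def eclose(nfa, states):
    """All states reachable from `states` using ε-moves only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for p in nfa.delta.get((q, EPS), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def accepts(nfa, word):
    """Simulate the ε-NFA on `word` by tracking the set of current states."""
    current = eclose(nfa, {nfa.start})
    for a in word:
        moved = set()
        for q in current:
            moved |= nfa.delta.get((q, a), set())
        current = eclose(nfa, moved)
    return nfa.final in current

# r = 00* + 1, built bottom-up from its sub-expressions
r = union(concat(symbol("0"), star(symbol("0"))), symbol("1"))
print([w for w in ["", "0", "000", "1", "01", "11"] if accepts(r, w)])
# prints ['0', '000', '1']
```

The simulation works on sets of states, so no conversion to a DFA is needed to test membership.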


EXAMPLE 7.6.1: Construct an NFA with ε-moves for 00* + 1.

Solution:

Let r = 00* + 1.

⇒ r = r₁ + r₂, where r₁ = 00* and r₂ = 1.

a. Construct a machine for r₁:

■ To accept 0:
■ To accept 0*:
■ To accept 00*:

b. Construct a machine for r₂:

■ To accept 1:

c. Construct a machine for r:

Figure 7.58. Automaton for r = r₁ + r₂.


EXAMPLE 7.6.2: Construct an NFA for r = (a + bb)* · ba*.

Solution:

Let r = (a + bb)* · ba*.

⇒ r = r₁ · r₂, where r₁ = (a + bb)* and r₂ = ba*.

a. Construct a machine for r₁:

■ To accept a:
■ To accept b:
■ To accept bb:
■ To accept a + bb:
■ To accept (a + bb)*:

b. Construct a machine for r₂:

■ To accept b:
■ To accept a*:
■ To accept ba*:

c. Construct a machine for r:

Figure 7.59. Automaton for r = r₁ · r₂.

EXAMPLE 7.6.3: Construct an NFA for (0 + 1)* 00 (0 + 1)*.


Solution:

Let r = (0 + 1)* 00 (0 + 1)*.

⇒ r = r₁ · r₂ · r₃, where r₁ = (0 + 1)*, r₂ = 00 and r₃ = (0 + 1)*.

a. Construct a machine for r₁:

■ To accept 0:
■ To accept 1:
■ To accept (0 + 1):
■ To accept (0 + 1)*:

b. Construct a machine for r₃:

c. Construct a machine for r₂:

■ To accept 0:
■ To accept 0:
■ To accept 00:

d. Construct a machine for r:

Figure 7.60. Automaton for r = r₁ · r₂ · r₃.

1. Define a regular set.

2. Show that regular languages are closed under Kleene closure and complementation.

3. Show that regular languages are closed under union and concatenation.

4. Show that regular languages are closed under quotients.

5. For the DFA D given in figure 7.61, create a DFA D′ that accepts language L.

Figure 7.61. DFA D to Accept Words of |w| mod 3 = 0.

6. Construct a finite automaton whose language is the intersection of the two languages
L₁ and L₂ that are accepted by machines M₁ and M₂, as shown in figures 7.62 and
7.63 respectively.

Figure 7.62. State Diagram M₁.

Figure 7.63. State Diagram M₂.

7. Obtain a regular expression for the DFA given in figure 7.64.

Figure 7.64. State Diagram DFA.

8. What is the language accepted by the following DFAs?

Figure 7.65. State Diagram.

Figure 7.66. State Diagram.

9. Prove that, for every regular expression r, there exists an NFA with ε-transitions that
accepts L(r).

10. Construct NFA for the following regular expressions.

a. 0*0* + 1*
b. 0*1*2*
c. (0 + 1)*00(0 + 1)
d. 0* + 1* + 2*
e. (ab* + b)*
f. (a + b)* (aa + bb) (a + b)*
‘Inputs to the nature yield rewarding outputs.’


Introduction 


In the previous chapters, we discussed an abstract model of a machine that accepts input values but does not
produce any output values. Some machines, however, produce output values
after accepting input values. In the case of an FSA (finite state automaton), a move from
state qᵢ to qⱼ depends on the input at qᵢ, and no output emerges. In the case of an FSM, however,
a move from state qᵢ to state qⱼ results in an output. Consequently, an FSM possesses two
special features: a finite set Γ of output symbols and an output function λ: Q × Σ → Γ,
where Σ is the input alphabet. One major limitation of finite automata is that their output
is limited to a binary signal, ‘accept’ or ‘don’t accept’, indicating the acceptance or rejection
of an input string respectively. To give finite automata output capabilities,
two classical machines were designed that transform input strings into output strings.
They are


a. Moore machine and 
b. Mealy machine. 


A Moore machine is an FSM, denoted Mₒ, named after Edward Moore, who introduced it in
1956.

A Mealy machine is an FSM, denoted Mₑ, named after George H. Mealy, who introduced it in
1955.

These machines are basically DFAs, except that they associate an output symbol with
each state or with each state transition. However, there are no final states, because there is
no acceptance or rejection involved.

Note-1:

The purpose of Moore and Mealy machines is not to answer yes or no, that is, to
accept or reject a string. They are not language recognisers but output
producers.

8.1 Moore Machine


Definition 
A Moore machine is a finite state automaton in which the outputs are determined by the current
state alone.

A Moore machine associates an output symbol with each state, and each time a state is
entered, an output is obtained simultaneously. So, the first output always occurs as soon as
the machine starts.

8.1.1 Elements of Moore Machine

A Moore machine has the following six characteristics:

a. A finite set Q of states q₀, q₁, ..., qₙ.
b. A finite input alphabet of letters for forming the input string, Σ = {a, b, ...}.
c. A finite output alphabet of possible output characters, Γ = {x, y, ...}.
d. A transition function δ that shows, for each state and each input letter, what state is
reached next.
e. An output function λ that shows what character from Γ is printed by each state
that is entered.
f. An initial state q₀.


8.1.2 Ordered Six-tuple Specification of Moore Machine 


Formally, the Moore machine Mₒ is represented by the 6-tuple
Mₒ = (Q, Σ, δ, q₀, Γ, λ)
where

Q is a finite set of states,
Σ is a finite set of input symbols,
δ: Q × Σ → Q is the transition function,
q₀ ∈ Q is the initial state,
Γ is a finite set of output symbols,
λ: Q → Γ is the output function.

Here, the output of Mₒ in response to the input a₁a₂...aₙ, n ≥ 0, is λ(q₀)λ(q₁)...λ(qₙ),
where q₀, q₁, ..., qₙ is the sequence of states such that δ(qᵢ₋₁, aᵢ) = qᵢ for 1 ≤ i ≤ n.
In particular, the output in response to the input ε is λ(q₀). Therefore, the output sequence
of Mₒ consists of (n + 1) symbols.

In other words, for a Moore machine, each state produces a one-character output
immediately upon the machine’s entry into that state. At the beginning, the start state
produces an output before any input has been read. Thus the output of a Moore machine is
one character longer than its input (n + 1, where n is the number of characters in the input string).


8.1.3 Description of a Moore Machine 


The transitions of a Moore machine can be represented using a transition diagram or a
transition table.

Transition diagram:

The transition diagram for a Moore machine includes the output for each state. Each
circle of the transition diagram is labelled with a compound symbol qᵢ/x to indicate that
λ(qᵢ) = x, and an arc labelled a from qᵢ to qⱼ indicates that δ(qᵢ, a) = qⱼ.

EXAMPLE 8.1.1: If the output associated with a state q₀ is x, then it is written as q₀/x inside
the circle. A typical transition diagram for a Moore machine is shown in figure 8.1.

Figure 8.1. Typical transition diagram of a Moore machine.

Transition Table:

Table 8.1 Transition and output table

Since every state of a Moore machine is associated with an output, the transition table is
called the transition and output table. The rows of the table correspond to states, and the columns
correspond to the inputs and the output. Entries give the next state for each input and the output of the state.

The transitions and outputs can also be represented in separate tables, as shown in
table 8.2.

Output table    Transition table
Table 8.2 Transition and output table.


EXAMPLE 8.1.2: Consider a Moore machine whose transition diagram is shown in
figure 8.2.

Figure 8.2. Transition diagram of Moore machine.

The transition table for the above diagram is given in table 8.3.

Table 8.3. Transition and output table


8.2 Design of a Moore Machine 


The basic design strategy for a Moore machine is as follows:

a. Understand the problem definition for which the Moore machine has to be designed.
b. Determine the required alphabet and state set.
c. Determine the required output set.
d. For each state, decide on the transition to be made for each character of the input string.
e. For each state, decide the required output.
f. Obtain the transition table and diagram.
g. Test the Moore machine obtained on short strings.

Note-2:

■ There are no accept states in a Moore machine because it is not a language recogniser
but an output producer.
■ In a Moore machine, for an input string of length n, the output sequence consists of
(n + 1) symbols.


EXAMPLE 8.2.1: Construct a Moore machine for the following:

Input alphabet Σ = {a, b}
Output alphabet Γ = {0, 1}
States Q = {q₀, q₁, q₂, q₃}

Table 8.4 Transition and Output Table

Solution: Given Σ = {a, b}, Γ = {0, 1} and Q = {q₀, q₁, q₂, q₃}. The Moore machine is
shown in figure 8.3.

Figure 8.3. Transition Diagram of Moore Machine (Mₒ).

Mₒ action for the input string:
For the input string S = bababbb, where n = 7, what is the output string?

■ λ(δ(q₀, ε)) = λ(q₀) = 0
■ λ(δ(q₀, b)) = λ(q₂) = 1
■ λ(δ(q₀, ba)) = λ(δ(δ(q₀, b), a))
              = λ(δ(q₂, a))
              = λ(q₂)
              = 1
■ λ(δ(q₀, bab)) = λ(δ(δ(q₀, ba), b))
               = λ(δ(q₂, b))
               = λ(q₃)
               = 0
■ λ(δ(q₀, baba)) = λ(δ(δ(q₀, bab), a))
                = λ(δ(q₃, a))
                = λ(q₀)
                = 0
■ λ(δ(q₀, babab)) = λ(δ(δ(q₀, baba), b))
                 = λ(δ(q₀, b))
                 = λ(q₂)
                 = 1
■ λ(δ(q₀, bababb)) = λ(δ(δ(q₀, babab), b))
                  = λ(δ(q₂, b))
                  = λ(q₃)
                  = 0
■ λ(δ(q₀, bababbb)) = λ(δ(δ(q₀, bababb), b))
                   = λ(δ(q₃, b))
                   = λ(q₁)
                   = 0

Thus, for the input string S, the output is 01100100 (note that its length is n + 1 = 8).


EXAMPLE 8.2.2: Construct a Moore machine for the following:
Input alphabet Σ = {a, b},
Output alphabet Γ = {0, 1},
States Q = {q₀, q₁}.

Table 8.5 Transition and Output Table

Output table        Transition table
Q     output        Q     a     b
q₀    0             q₀    q₀    q₁
q₁    1             q₁    q₁    q₁

Solution: The transition diagram of a Moore machine for Σ = {a, b}, Γ = {0, 1} and
Q = {q₀, q₁} is shown in figure 8.4.

Figure 8.4. Transition Diagram of Moore Machine (Mₒ).

Mₒ action for the input string:
For the input S = abab, what is the output string?
■ λ(δ(q₀, ε)) = λ(q₀) = 0
■ λ(δ(q₀, a)) = λ(q₀) = 0
■ λ(δ(q₀, ab)) = λ(δ(δ(q₀, a), b))
              = λ(δ(q₀, b))
              = λ(q₁)
              = 1
■ λ(δ(q₀, aba)) = λ(δ(δ(q₀, ab), a))
               = λ(δ(q₁, a))
               = λ(q₁)
               = 1
■ λ(δ(q₀, abab)) = λ(δ(δ(q₀, aba), b))
                = λ(δ(q₁, b))
                = λ(q₁)
                = 1
Thus, output string = 00111.

Similarly, to get the output string 000111, the input string required is aabaa or aabbb or
aabab.
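
The step-by-step evaluation above can be reproduced mechanically. The following is a minimal Python sketch, assuming the machine of table 8.5 is encoded as two dictionaries; the names `run_moore`, `DELTA` and `LAMBDA` are illustrative and do not come from the text.

```python
def run_moore(delta, lam, start, word):
    """Return the output string of a Moore machine on `word`."""
    state = start
    out = [lam[state]]          # the start state prints before any input is read
    for symbol in word:
        state = delta[(state, symbol)]
        out.append(lam[state])  # one output symbol per state entered
    return "".join(out)

# Transition and output functions of table 8.5
DELTA = {("q0", "a"): "q0", ("q0", "b"): "q1",
         ("q1", "a"): "q1", ("q1", "b"): "q1"}
LAMBDA = {"q0": "0", "q1": "1"}

print(run_moore(DELTA, LAMBDA, "q0", "abab"))   # 00111
print(run_moore(DELTA, LAMBDA, "q0", "aabaa"))  # 000111
```

For an input of length n the returned string always has length n + 1, because the output of the start state is emitted first.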
EXAMPLE 8.2.3: (A simple traffic signal problem)


Suppose there is a simple traffic intersection, where a north-south highway intersects an
east-west highway. Assume that the east-west highway always has a green light unless some
north-south traffic is detected by sensors. When north-south traffic is detected, after a certain
delay, the signals change and stay that way for a fixed period of time. It is required to design
an appropriate circuit to capture the behaviour stated above.

Solution: We choose a Moore machine as a model for the required circuit, as follows:

Figure 8.5. Moore Machine Model for Simple Intersection Traffic Problem.

The input symbols for the required Moore machine are 0 (no traffic detected) and 1 (traffic
detected). Let G, Y and R mean green, yellow and red. The output strings are GR, YR, RG
and RY, where the first letter of the string is the colour of the east-west light and the second
letter of the string represents the colour of the north-south light.


Edward Forrest Moore (1925-2003) was born in Baltimore, Maryland. He graduated
from Virginia Polytechnic Institute in 1947 and received his Ph.D. in mathematics
from Brown University three years later. He joined the faculty of the University of
Wisconsin, Madison, and taught there until his retirement in 1985.

Moore made outstanding contributions to the logical design of switching
circuits, automata theory, graph theory and database management.


EXAMPLE 8.2.4: Construct a Moore machine to compute the number of substrings of
the form bab that occur in an arbitrary input string over the alphabet {a, b}, with output
alphabet {0, 1}.

Solution: Let Mₒ = (Q, Σ, δ, q₀, Γ, λ) be the Moore machine.

Consider the input alphabet Σ = {a, b}, the initial state q₀,
the output alphabet Γ = {0, 1}, and
the states Q = {q₀, q₁, q₂, q₃}.

The Moore machine to compute the number of substrings of the form bab in a given
string is shown in table 8.6 and figure 8.6. The machine produces an output character ‘1’
each time it finds bab.

Output table
Q     output
q₀    0
q₁    0
q₂    0
q₃    1

Table 8.6 Transition and Output Table

Transition diagram:

Figure 8.6. Moore Machine.

Mₒ action for the string:

The output of this Moore machine for the sample string abababaababb is 0000101000010.
We can count the number of 1’s in the output string to obtain the number of occurrences of
the substring bab, i.e.,

input string:  abababaababb
output string: 0000101000010

Number of 1’s in the output string = 3.

Thus, the substring bab appears thrice in the input string.
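
The counting behaviour can be cross-checked in code. The sketch below is only an illustration: the output function is taken from table 8.6, but the transition function `DELTA` is an assumption (q₁ is read as "last symbol was b", q₂ as "last two symbols were ba", q₃ as "bab just completed"), chosen because it reproduces the output string quoted above. The function names are likewise hypothetical.

```python
# Output function as listed in table 8.6.
LAMBDA = {"q0": "0", "q1": "0", "q2": "0", "q3": "1"}

# ASSUMED transition function (not legible in the source table):
# q1 = seen "b", q2 = seen "ba", q3 = seen "bab".
DELTA = {
    ("q0", "a"): "q0", ("q0", "b"): "q1",
    ("q1", "a"): "q2", ("q1", "b"): "q1",
    ("q2", "a"): "q0", ("q2", "b"): "q3",
    ("q3", "a"): "q2", ("q3", "b"): "q1",
}

def bab_moore_output(word, start="q0"):
    """Output string of the assumed Moore machine on `word`."""
    state = start
    out = [LAMBDA[state]]
    for ch in word:
        state = DELTA[(state, ch)]
        out.append(LAMBDA[state])
    return "".join(out)

def count_bab_naive(word):
    """Count (possibly overlapping) occurrences of bab directly."""
    return sum(word[i:i + 3] == "bab" for i in range(len(word) - 2))

sample = "abababaababb"
print(bab_moore_output(sample))             # 0000101000010
print(bab_moore_output(sample).count("1"))  # 3
print(count_bab_naive(sample))              # 3
```

Overlapping occurrences are counted separately, which is why the prefix ababab of the sample contributes two 1's.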
8.3 Mealy Machine


Definition 
A Mealy machine is a finite state machine in which the outputs are determined by the current
state and the input.

A Mealy machine associates an output symbol with each transition, and the output depends
on the current input.

8.3.1 Elements of a Mealy machine

A Mealy machine features the following six characteristics:

a. A finite set Q of states q₀, q₁, ..., qₙ.
b. A finite input alphabet of letters for forming the input string, Σ = {a, b, ...}.
c. A finite output alphabet of possible output characters, Γ = {x, y, ...}.
d. A transition function δ to show, for each state and each input, what state is entered.
e. An output function λ to show what character from Γ is printed by traversing from one
state to another.
f. An initial state q₀.


8.3.2 Ordered Six-tuple specification of a Mealy machine 


Formally, the Mealy machine Mₑ is represented by the 6-tuple

Mₑ = (Q, Σ, δ, q₀, Γ, λ)
where

Q is a finite set of states,
Σ is a finite set of input symbols,
δ: Q × Σ → Q is the transition function,
q₀ ∈ Q is the initial state,
Γ is a finite set of output symbols,
λ: Q × Σ → Γ is the output function.

Here, the output of Mₑ in response to the input a₁a₂...aₙ, n ≥ 0, is λ(q₀, a₁)λ(q₁, a₂)...
λ(qₙ₋₁, aₙ), where q₀, q₁, ..., qₙ is the sequence of states such that

δ(qᵢ₋₁, aᵢ) = qᵢ for 1 ≤ i ≤ n.

The Mealy machine gives an output of ε in response to the input ε. So, the output sequence
has length n and not n + 1. In other words, the output string of a Mealy machine is
of the same length as its input string.


8.3.3 Description of a Mealy machine 


The transitions of a Mealy machine can be represented using a transition diagram or a transition table.

Transition diagram:

The transition diagram for a Mealy machine includes the output for each transition edge.
Each edge from state p to state q is labelled with a compound symbol a/b to indicate that
δ(p, a) = q and λ(p, a) = b. Every state must have exactly one outgoing edge for each
possible input symbol.

EXAMPLE 8.3.1: If the output associated with the edge labelled with symbol ‘a’ is ‘y’, then it
is written as a/y on that edge. A typical state transition for a Mealy machine is represented
as shown in figure 8.7.

Figure 8.7. Typical Transition diagram of Mealy machine.

Transition Table:

The transition table for a Mealy machine consists of two sub-tables.

■ Transition table: Rows correspond to states and columns correspond to inputs. Entries
correspond to next states.
■ Output table: Rows correspond to states and columns correspond to inputs. Entries
correspond to outputs.

Table 8.7 Transition table    Table 8.8 Output table

In the output table, b₁ ... bₙ are the output symbols for the given inputs a₁ ... aₙ in a given
state.

EXAMPLE 8.3.2: Consider a Mealy machine whose transitions are described in the transition
diagram (figure 8.8).

Figure 8.8. Transition diagram of Mealy machine.

The transition table for the above transition diagram is shown in table 8.9 and table 8.10.

Table 8.9 Transition table    Table 8.10 Output table


8.4 Design of a Mealy Machine 


The basic design strategy for a Mealy machine is as follows:

a. Understand the problem definition for which a Mealy machine is required.
b. Determine the required alphabet, state and output set.
c. For each state, decide the transition to be made for each character of the input string.
d. Decide the output to be associated with each edge label.
e. Obtain the transition table, output table and transition diagram.
f. Test the Mealy machine obtained on short strings.
EXAMPLE 8.4.1: Construct a Mealy machine for the following:


Input alphabet Σ = {0, 1}.
Output alphabet Γ = {y, n}.
States Q = {q₀, p₀, p₁}.

Transition table    Output table

Table 8.11 Transition and Output Table

Solution: Let Mₑ = (Q, Σ, δ, q₀, Γ, λ) be the Mealy machine. Given Σ = {0, 1}, Q =
{q₀, p₀, p₁}, Γ = {y, n}. With q₀ as the initial state, the functions δ and λ are shown in figure 8.9.

Figure 8.9. Transition Diagram of Mealy Machine

Mₑ action for the string:
For the input string 01100, what is the output string and the sequence of states entered?

■ δ(q₀, 0) = p₀.
■ δ(q₀, 01) = δ(δ(q₀, 0), 1)
           = δ(p₀, 1) = p₁.
■ δ(q₀, 011) = δ(δ(q₀, 01), 1)
            = δ(p₁, 1) = p₁.
■ δ(q₀, 0110) = δ(δ(q₀, 011), 0)
             = δ(p₁, 0) = p₀.
■ δ(q₀, 01100) = δ(δ(q₀, 0110), 0)
              = δ(p₀, 0) = p₀.

inputs:       0    1    1    0    0
states:  q₀   p₀   p₁   p₁   p₀   p₀
output:       n    n    y    n    y

Figure 8.10. Sequence State Diagram.

The sequence of states entered for the input 01100 is q₀p₀p₁p₁p₀p₀ (the number of states entered is
m = 6, and the length of the output sequence is x = m − 1, i.e. x = 5).
From the sequence state diagram:

■ λ(q₀, 0) = n
■ λ(p₀, 1) = n
■ λ(p₁, 1) = y
■ λ(p₁, 0) = n
■ λ(p₀, 0) = y

Thus, output sequence = nnyny.


EXAMPLE 8.4.2: Construct a Mealy machine to print out the 1’s complement of an input bit
string.

Solution: Let Mₑ = (Q, Σ, δ, q₀, Γ, λ) be the Mealy machine such that Q = {q₀}, the input
and output alphabets are Σ = Γ = {0, 1}, and q₀ is the initial state. The transition and output functions to
print the 1’s complement of an input string are shown in table 8.12 and figure 8.11. The machine
produces an output character 1 each time it finds an input character 0, and vice versa.

Transition and output table:

Transition table        Output table
       Σ                       Γ
Q    0    1             Q    0    1
q₀   q₀   q₀            q₀   1    0

Table 8.12 Transition and Output Table

Transition diagram:

Figure 8.11. Transition Diagram for the Mealy Machine (a single state q₀ with self-loops labelled 0/1 and 1/0).

Mₑ action for the string

For the input string 0101, what is the output string?

input:        0    1    0    1
states:  q₀   q₀   q₀   q₀   q₀
output:       1    0    1    0

Figure 8.12. Sequence State diagram.

Output sequence is:

■ λ(q₀, 0) = 1
■ λ(q₀, 1) = 0
■ λ(q₀, 0) = 1
■ λ(q₀, 1) = 0

Thus, output = 1010.
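
A Mealy machine can be simulated in the same way as a Moore machine, except that the output is read off the transition rather than the state. The following is a minimal Python sketch, assuming table 8.12 is encoded as dictionaries keyed by (state, input) pairs; the names `run_mealy`, `DELTA` and `LAMBDA` are illustrative only.

```python
def run_mealy(delta, lam, start, word):
    """Return the output string of a Mealy machine on `word`."""
    state, out = start, []
    for symbol in word:
        out.append(lam[(state, symbol)])  # output belongs to the transition taken
        state = delta[(state, symbol)]
    return "".join(out)

# Transition and output functions of table 8.12 (1's complement)
DELTA = {("q0", "0"): "q0", ("q0", "1"): "q0"}
LAMBDA = {("q0", "0"): "1", ("q0", "1"): "0"}

print(run_mealy(DELTA, LAMBDA, "q0", "0101"))  # 1010
```

Unlike the Moore case, the empty input produces the empty output, so the output always has the same length as the input.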


EXAMPLE 8.4.3: Construct a Mealy machine to compute the number of substrings of the
form bab that occur in an arbitrary input string, over the alphabet {a, b} and output alphabet
{0, 1}.




Downloaded from https://www.cambridge.org/core. Stockholm University Library, on 06 Dec 2018 at 07:47:21, subject to the Cambridge Core terms of use, available at 
https://www.cambridge.org/core/terms. https://doi.org/10.1017/UP09788175968363.009 


Transducers 


Solution: Let Mₑ = (Q, Σ, δ, q₀, Γ, λ) be the Mealy machine, such that Q =
{q₀, q₁, q₂, q₃}, Σ = {a, b}, Γ = {0, 1} and q₀ is the initial state. The transition and output functions
required to compute the number of substrings of the form bab in an arbitrary input string
are shown in table 8.13 and figure 8.13. The machine produces an output character 1 each
time it finds a substring bab.

Transition and output table:

Output table
       Γ
Q    a    b
q₀   0    0
q₁   0    0
q₂   0    1
q₃   0    0

Table 8.13 Transition and Output Table

Transition diagram:

Figure 8.13. Transition Diagram of Mealy Machine.

Mₑ action for the input string

For the sample string abababaababb the output is 000101000010, where each 1 indicates
that an occurrence of the substring bab ends at that point. Each 0 indicates that the three most recent inputs,
including the current input, do not form a substring of the form bab,

i.e., input string:  abababaababb
      output string: 000101000010

Number of 1’s in the output string = 3

Thus, the substring bab appears thrice in the main string.
EXAMPLE 8.4.4: Construct a Mealy machine to count the number of occurrences of a
substring of the form aa or bb, in an arbitrary input string, over the input alphabet {a, b} 
and output alphabet {0, 1}. 


Solution: Let Mₑ = (Q, Σ, δ, q₀, Γ, λ) be the Mealy machine such that Q = {q₀, q₁, q₂}, Σ =
{a, b}, Γ = {0, 1} and q₀ is the initial state. The machine produces an output character 1 each time it
finds that it has just seen a double letter. The transitions required to count the number of
occurrences of aa or bb are shown in figure 8.14.

Figure 8.14. Transition diagram for the Mealy machine.

Mₑ action for the input string

input string:  abaaababbbaa
output string: 000110001101
Number of 1’s in the output string = 5

Thus, the substring aa or bb appears 5 times in the main string.
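
The behaviour of this three-state machine can be sketched compactly. The code below is an illustration under an assumption: since the diagram is not legible here, q₁ is taken to mean "the last symbol was a" and q₂ "the last symbol was b". With that reading, an output of 1 is produced exactly when the machine is asked to re-enter the state it is already in. The names `NEXT` and `double_letter_output` are hypothetical.

```python
# ASSUMED state roles: q1 remembers a trailing a, q2 a trailing b.
NEXT = {"a": "q1", "b": "q2"}

def double_letter_output(word):
    """Mealy-style output: 1 whenever the current input repeats the previous one."""
    state, out = "q0", []
    for ch in word:
        target = NEXT[ch]                            # delta(state, ch)
        out.append("1" if state == target else "0")  # lambda(state, ch)
        state = target
    return "".join(out)

sample = "abaaababbbaa"
print(double_letter_output(sample))             # 000110001101
print(double_letter_output(sample).count("1"))  # 5
```

The output depends on both the current state and the current input, which is what makes this a Mealy rather than a Moore machine.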


EXAMPLE 8.4.5: Construct a Mealy machine that reports the parity of each 4-bit substring 
in an arbitrary input string, over the alphabet {0, 1} and output alphabet {0, 1}. 


Solution: Let Mₑ = (Q, Σ, δ, q₀, Γ, λ) be the Mealy machine such that Q = {q₀, q₁, q₂,
q₃, q₄, q₅, q₆}, Σ = {0, 1}, Γ = {0, 1} and q₀ is the initial state. For each of the first 3 bits
of each 4-bit substring, the machine produces the output character 0. If a 4-bit substring contains an even
number of 1’s, then the machine outputs a character 0 on the fourth bit of the 4-bit substring;
otherwise it gives an output of 1.

4-bit substring:
For each of the first 3 bits: Output 0.
For the fourth bit: Output 1 if the no. of 1’s is odd.
                    Output 0 if the no. of 1’s is even.

Figure 8.15. In the output for a 4-bit substring, the fourth bit is 0 if the no. of 1’s is even; otherwise it
is 1.

The transitions of the Mealy machine are shown in figure 8.16.

Figure 8.16. Transition diagram for the Mealy machine (the fourth-bit edges are labelled 0/0, 1/1 and 0/1, 1/0).

Mₑ action on the input string
Consider the input string 000111011100. The output string is as follows:

input string:   0001             1101             1100
output string:  0001             0001             0000
                odd no. of 1’s   odd no. of 1’s   even no. of 1’s

Figure 8.17. Machine’s output on parity of each 4-bit substring.

Thus, the output is 000100010000.
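
The seven states can be generated systematically. The sketch below is an illustration only: it assumes a particular numbering of the states (q₁/q₂ after one bit with even/odd parity, q₃/q₄ after two bits, q₅/q₆ after three bits), which the text does not spell out, and the names `build_parity_mealy` and `parity_output` are hypothetical.

```python
def build_parity_mealy():
    """Build delta and lambda for the 4-bit block parity machine."""
    # (bits read in the current block, parity so far) -> ASSUMED state name
    name = {(0, 0): "q0",
            (1, 0): "q1", (1, 1): "q2",
            (2, 0): "q3", (2, 1): "q4",
            (3, 0): "q5", (3, 1): "q6"}
    delta, lam = {}, {}
    for (k, p), q in name.items():
        for bit in "01":
            p2 = p ^ int(bit)            # parity after reading this bit
            if k < 3:                    # bits 1 to 3 of a block: output 0
                delta[(q, bit)] = name[(k + 1, p2)]
                lam[(q, bit)] = "0"
            else:                        # fourth bit: report parity, restart
                delta[(q, bit)] = "q0"
                lam[(q, bit)] = str(p2)
    return delta, lam

def parity_output(word):
    delta, lam = build_parity_mealy()
    state, out = "q0", []
    for bit in word:
        out.append(lam[(state, bit)])
        state = delta[(state, bit)]
    return "".join(out)

print(parity_output("000111011100"))  # 000100010000
```

With this numbering, q₅ carries the edge labels 0/0 and 1/1, and q₆ carries 0/1 and 1/0, both returning to q₀.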
EXAMPLE 8.4.6: Design a Mealy machine to add two binary numbers of the form x₁x₂...xₖ
and y₁y₂...yₖ.

Consider xᵢ, yᵢ ∈ {0, 1}. Assume that x and y contain the same number of bits and that the
leftmost bits are zeros. Let x = (xₙxₙ₋₁...x₁x₀) and y = (yₙyₙ₋₁...y₁y₀), where
xₙ = yₙ = 0. For example, to add x = 11101 and y = 01100, the x and y bits are
extended with ‘0’ as the leftmost bit, i.e.,

x = 011101 and y = 001100.

The corresponding bits xᵢ and yᵢ of x and y are added from right to left.

x = 011101
y = 001100
    ← add from right to left.

Thus, for the addition of xᵢ and yᵢ, the following four possible choices exist.

xᵢ | yᵢ | Result
0  | 0  | 0 (addition of 0 and 0 produces a sum of 0 and no carry)
0  | 1  | 1 (addition of 0 and 1 produces a sum of 1 and no carry)
1  | 0  | 1 (addition of 1 and 0 produces a sum of 1 and no carry)
1  | 1  | 10 (addition of 1 and 1 produces a sum of 0 and carry of 1)

Table 8.14 Four possible choices for xᵢ and yᵢ

In other words, two binary numbers are added in pairs of bits: 00, 01, 10 and 11. Addition of
two bits either produces a carry or does not produce a carry.

The Mealy machine to add two binary numbers consists of two states q₀ and q₁. The state
q₀ corresponds to the sum of bits that does not produce a carry and state q₁ corresponds to
the sum of bits that produces a carry.

Processing at q₀: Sum of bits that does not produce a carry.

Current state | Input | Output | Next state | Meaning
q₀            | 00    | 0      | q₀         | 0+0 = 0, i.e., sum is 0 and no carry
q₀            | 01    | 1      | q₀         | 0+1 = 1, i.e., sum is 1 and no carry
q₀            | 10    | 1      | q₀         | 1+0 = 1, i.e., sum is 1 and no carry
q₀            | 11    | 0      | q₁         | 1+1 = 10, i.e., sum is 0 and carry is 1

Table 8.15 Processing at q₀




Processing at q₁: Sum of bits that produces a carry.

Current state | Input | Output | Next state | Meaning
q₁            | 00    | 1      | q₀         | 1+0+0 = 1, i.e., add the bits 0, 0 and the carry 1
q₁            | 01    | 0      | q₁         | 1+0+1 = 10, i.e., sum is 0 and carry is 1
q₁            | 10    | 0      | q₁         | 1+1+0 = 10, i.e., sum is 0 and carry is 1
q₁            | 11    | 1      | q₁         | 1+1+1 = 11, i.e., add the bits 1, 1 and the carry 1

Table 8.16 Processing at q₁

Thus the Mealy machine is Mₑ = (Q, Σ, δ, q₀, Γ, λ)

where Q = {q₀, q₁},
Σ = {0, 1} (read as pairs of bits xᵢyᵢ),
q₀ is the initial state,
Γ = {0, 1},

and δ, λ are given in the transition diagram shown in figure 8.17.1.

Figure 8.17.1. Mealy machine to add two binary integer numbers (q₀ has self-loops 00/0, 01/1, 10/1 and an edge 11/0 to q₁; q₁ has self-loops 01/0, 10/0, 11/1 and an edge 00/1 to q₀).

Mₑ action for the input string

Consider the input strings x = 11101 and y = 01100. The strings x and y, with 0
appended as the leftmost bit, are x = 011101 and y = 001100.

Thus,
x = 011101
y = 001100

Therefore, the input bit pairs, from right to left, are 10, 00, 11, 11, 10 and 00.

Now,

input:        10   00   11   11   10   00
states:  q₀   q₀   q₀   q₁   q₁   q₁   q₀
output:       1    0    0    1    0    1

Figure 8.18. State sequence diagram.

Thus, x + y = 101001 (the output string is written backward).
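
The adder can be written out directly from tables 8.15 and 8.16. The following Python sketch is illustrative: the dictionaries encode the two tables, while the function name `add_binary`, the padding with `zfill` and the final reversal of the output are choices made for this example.

```python
# Next-state function: q0 = no carry pending, q1 = carry pending.
DELTA = {
    ("q0", "00"): "q0", ("q0", "01"): "q0", ("q0", "10"): "q0", ("q0", "11"): "q1",
    ("q1", "00"): "q0", ("q1", "01"): "q1", ("q1", "10"): "q1", ("q1", "11"): "q1",
}
# Output function: the sum bit produced on each transition.
LAMBDA = {
    ("q0", "00"): "0", ("q0", "01"): "1", ("q0", "10"): "1", ("q0", "11"): "0",
    ("q1", "00"): "1", ("q1", "01"): "0", ("q1", "10"): "0", ("q1", "11"): "1",
}

def add_binary(x, y):
    """Add two binary strings by feeding bit pairs, right to left, to the machine."""
    width = max(len(x), len(y)) + 1      # extra leading 0 absorbs the last carry
    x, y = x.zfill(width), y.zfill(width)
    state, out = "q0", []
    for xb, yb in zip(reversed(x), reversed(y)):
        pair = xb + yb
        out.append(LAMBDA[(state, pair)])
        state = DELTA[(state, pair)]
    return "".join(reversed(out))        # the output was produced backward

print(add_binary("11101", "01100"))  # 101001
```

Because the leftmost pair is always 00, the machine finishes in q₀ and no carry is lost.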


8.5 Difference between Moore and Mealy Machines 


a. A Moore machine gives the output λ(q₀) in response to the input ε, so the length of the output sequence
is n + 1, not n.
A Mealy machine does not give the output λ(q₀) in response to the input ε, so the length of the output
sequence is n, not n + 1.

b. A Moore machine has actions associated with states, i.e. it prints characters
when in a state.
A Mealy machine has actions associated with transitions, i.e. it prints
characters when traversing an arc.

c. The output of a Moore machine depends only on the current state and does
not depend on the current input.
The output of a Mealy machine depends on both the current state and the current input.

d. In a Moore machine, if the output associated with state q is x, then q/x is written inside
the state circle of the transition diagram.
In a Mealy machine, if the output associated with the edge labelled with the letter a is x,
it is written as a/x on that edge.

Moore machine    Mealy machine

Figure 8.19. Transition Diagram for Moore and Mealy Machine.
8.6 Properties/Equivalence of Moore and Mealy Machines


Neglecting the response of a Moore machine to the input ε, the Moore machine and the Mealy
machine are equivalent. Thus, given a Moore machine, we can construct a Mealy machine
and vice versa.

8.6.1 Construction of Mealy machine for a given Moore machine

Theorem I: If Mₒ is a Moore machine, then there is a Mealy machine Mₑ that is equivalent
to it.

Proof: Let Mₒ be a Moore machine given by

Mₒ = (Q, Σ, δ, q₀, Γ, λ).

Let Mₑ be a Mealy machine given by

Mₑ = (Q, Σ, δ, q₀, Γ, λ′).

This theorem can be proved in a constructive manner, with λ′ defined as

λ′(q, a) = λ(δ(q, a)) for all q and a.

Then, Mₒ and Mₑ enter the same sequence of states on the same input, and with each
transition Mₑ produces the output that Mₒ associates with the state entered.

Alternatively, Mₑ is constructed from Mₒ as follows:

a. Consider any state qᵢ of Mₒ.
b. Assume Mₒ prints the character t upon entering qᵢ. Hence, the label in state qᵢ is qᵢ/t.
c. Assume that there are n arcs entering qᵢ, with labels a₁, a₂, ..., aₙ.
d. Now, create the machine Mₑ by changing the labels on the incoming arcs of qᵢ from aₘ
to aₘ/t, m = 1, 2, ..., n.
e. Change the label of state qᵢ to be just qᵢ.

Figure 8.20. Moore and its equivalent Mealy.
EXAMPLE 8.6.1: Convert the following Moore machine to an equivalent Mealy machine.


Figure 8.21. Moore Machine.

Solution: Construct Mₑ as
Mₑ = (Q, Σ, δ, Γ, λ′, q₀)
where λ′(q, a) = λ(δ(q, a)), ∀ q and a.
Thus,
■ λ′(q₀, a) = λ(δ(q₀, a))
           = λ(q₀)
           = 0 ⇒ a/0
■ λ′(q₀, b) = λ(δ(q₀, b))
           = λ(q₁)
           = 1 ⇒ b/1
■ λ′(q₁, a) = λ(δ(q₁, a))
           = λ(q₁)
           = 1 ⇒ a/1
■ λ′(q₁, b) = λ(δ(q₁, b))
           = λ(q₁)
           = 1 ⇒ b/1

The corresponding Mealy machine is given in figure 8.22.

Figure 8.22. Mealy Machine.
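
The rule λ′(q, a) = λ(δ(q, a)) translates into a one-line dictionary comprehension. The following Python sketch applies it to the machine of this example; the encoding as dictionaries and the names `moore_to_mealy`, `run_moore` and `run_mealy` are assumptions made for illustration.

```python
def moore_to_mealy(delta, lam):
    """Return lam2 with lam2[(q, a)] = lam[delta[(q, a)]]; delta is unchanged."""
    return {(q, a): lam[target] for (q, a), target in delta.items()}

def run_moore(delta, lam, start, word):
    state, out = start, [lam[start]]
    for a in word:
        state = delta[(state, a)]
        out.append(lam[state])
    return "".join(out)

def run_mealy(delta, lam2, start, word):
    state, out = start, []
    for a in word:
        out.append(lam2[(state, a)])
        state = delta[(state, a)]
    return "".join(out)

# Moore machine of Example 8.6.1
DELTA = {("q0", "a"): "q0", ("q0", "b"): "q1",
         ("q1", "a"): "q1", ("q1", "b"): "q1"}
LAMBDA = {"q0": "0", "q1": "1"}

LAMBDA2 = moore_to_mealy(DELTA, LAMBDA)
print(LAMBDA2[("q0", "a")], LAMBDA2[("q0", "b")],
      LAMBDA2[("q1", "a")], LAMBDA2[("q1", "b")])  # 0 1 1 1
print(run_moore(DELTA, LAMBDA, "q0", "abab"))      # 00111
print(run_mealy(DELTA, LAMBDA2, "q0", "abab"))     # 0111
```

The two outputs differ only in the first symbol, which is the Moore machine's response to ε.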


EXAMPLE 8.6.2: Convert the following Moore machine into an equivalent Mealy machine. 
Given: Mₒ = ({q₀, q₁, q₂, q₃}, {a, b}, {0, 1}, δ, λ, q₀), where λ and δ are given below.

         Σ
Q     a     b     λ
q₀    q₁    q₃    0
q₁    q₃    q₁    0
q₂    q₀    q₃    0
q₃    q₃    q₂    1

Table 8.17 Transition and output table of the Moore Machine

Solution: Based on the given data, the transition diagram of the Moore machine is shown in
figure 8.23.

Figure 8.23. Moore Machine.


To construct a Mealy machine Me = (Q, Σ, δ, Γ, λ', q0), define λ'(q, a) = λ(δ(q, a)) ∀ q 
and a. 
Thus, 
■ λ'(q0, a) = λ(δ(q0, a)) = λ(q1) = 0 ⇒ a/0 
■ λ'(q0, b) = λ(δ(q0, b)) = λ(q3) = 1 ⇒ b/1 
■ λ'(q1, a) = λ(δ(q1, a)) = λ(q3) = 1 ⇒ a/1 
■ λ'(q1, b) = λ(δ(q1, b)) = λ(q1) = 0 ⇒ b/0 
■ λ'(q2, a) = λ(δ(q2, a)) = λ(q0) = 0 ⇒ a/0 
■ λ'(q2, b) = λ(δ(q2, b)) = λ(q3) = 1 ⇒ b/1 
■ λ'(q3, a) = λ(δ(q3, a)) = λ(q3) = 1 ⇒ a/1 
■ λ'(q3, b) = λ(δ(q3, b)) = λ(q2) = 0 ⇒ b/0. 
The corresponding Mealy machine is given in figure 8.24. 




Figure 8.24. Mealy Machine. 


EXAMPLE 8.6.3: Convert the Moore machine into an equivalent Mealy machine. 


Figure 8.25. Moore Machine. 


Solution: From figure 8.25, construct Me by changing the labels on all incoming arcs of 
each state qi to am/t, for m = 1, 2, ..., n. 




Figure 8.26. Mealy Machine. 


8.6.2 Construction of a Moore machine for a given Mealy machine 


Theorem II Let Me = (Q, Σ, δ, λ, q0, Γ) be a Mealy machine. Then, there is a Moore 
machine Mo, equivalent to Me. 


Proof: Given Me = (Q, Σ, δ, λ, q0, Γ) as a Mealy machine. 


Let Mo = (Q × Γ, Σ, δ', λ', [q0, b0], Γ) where b0 ∈ Γ (arbitrarily selected), i.e., the states 
of Mo are pairs [q, b] consisting of a state of Me and an output symbol. 


Define: δ'([q, b], a) = [δ(q, a), λ(q, a)] and λ'([q, b]) = b. 


The first component of a state of Mo determines the moves and the second component is the 
output of Me on some transition into state q. 
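
The pair construction in the proof can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the dictionary encoding, the names `mealy_to_moore` and `run_moore`, and the small two-state Mealy machine are hypothetical, and the Moore machine's output on its start state is ignored when comparing runs.

```python
from itertools import product


def mealy_to_moore(states, inputs, outputs, delta, lam, q0, b0):
    """States of the Moore machine are pairs (q, b); see the definitions of delta' and lambda'."""
    moore_states = list(product(states, outputs))
    moore_delta = {((q, b), a): (delta[(q, a)], lam[(q, a)])
                   for (q, b) in moore_states for a in inputs}
    moore_out = {(q, b): b for (q, b) in moore_states}
    return moore_states, moore_delta, moore_out, (q0, b0)


def run_moore(moore_delta, moore_out, start, word):
    """Collect the output of each state entered (the start state's output is skipped)."""
    state, outputs = start, []
    for symbol in word:
        state = moore_delta[(state, symbol)]
        outputs.append(moore_out[state])
    return "".join(outputs)


# Hypothetical Mealy machine over inputs {a, b} and outputs {0, 1}.
delta = {("q0", "a"): "q0", ("q0", "b"): "q1",
         ("q1", "a"): "q0", ("q1", "b"): "q1"}
lam = {("q0", "a"): "0", ("q0", "b"): "1",
       ("q1", "a"): "1", ("q1", "b"): "0"}

states, moore_delta, moore_out, start = mealy_to_moore(
    ["q0", "q1"], "ab", "01", delta, lam, "q0", "0")
print(len(states))                                        # |Q| * |Gamma| pair states
print(run_moore(moore_delta, moore_out, start, "abba"))   # same outputs as the Mealy run
```

Some of the pair states may be unreachable from the start state; the sketch keeps them, exactly as the formal construction does.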


During the process of conversion from Mealy to Moore machine, the following cases 
need to be considered. 


Case-1: If we try to push the output from the edge to the inside of the state, as it 
should be for a Moore machine, we might end up with a conflict. Two edges might come 
into the same state having different outputs. 


Solution: Make two copies of the same state and label them. 
e.g.: Consider the following diagram. 


Figure 8.27 State of Mealy Machine. 


There are two copies of the same state. The edge a/0 goes to [q1,0] and b/1 goes to 
[q1, 1]. The edge, labelled b/0, also goes to [q1, 0]. 


The equivalent state of figure 8.27 is given in figure 8.28. 




Figure 8.28 Equivalent Moore States. 


This procedure is repeated for all the states in the Mealy machine, where two edges 
come into the same state with different outputs. 


Case-2: Two edges in Mealy machine coming to the same state with one edge as loop 
and another as an arc. 


Solution: Here there are two copies of the same state with, 


i. one having loop 
ii. one without loop. 


e.g.: Consider the transition diagram in figure 8.29. 


Figure 8.29 States of Mealy Machine. 


Now, the edge labelled a/0 has to enter a copy of q2, with output 0. The loop b/1 at 
q2 has to enter q2, with output 1. When we enter q2 from a/0, we enter [q2, 0], but we 
must also be able to loop with b’s while staying in a q2 state. So, an edge must connect 
[q2, 0] to [q2, 1]. 




Figure 8.30 State Equivalent of Figure 8.29 


As it must be allowed to repeat as many b’s as we want, there must be a ‘b’ loop at 
the state [g2, 1]. Each b loop, gone around once, prints another 1 when it enters [q2, 1]. 


Thus, case-2 procedure is repeated for all the states to get an equivalent Moore 
machine. 


Case-3: If we have to make copies of the start state in the Mealy machine, then we can work 
with any one of the copies as the start state in the Moore machine. This is because they all give 
identical directions for proceeding to other states. 


EXAMPLE 8.6.4: Convert the Mealy machine in figure 8.31 into an equivalent Moore 
machine. 


Figure 8.31. Mealy Machine. 


Solution: Given the Mealy machine Me = ({q0, q1}, {a, b}, {0, 1}, δ, q0, λ), we construct an 
equivalent Moore machine Mo = (Q × Γ, Σ, Γ, δ', λ', [q0, b0]), 


where Q × Γ = {[q0, 0], [q0, 1], [q1, 0], [q1, 1]}, Σ = {a, b}. 


We need to find out: 


δ'([q, b], a) = [δ(q, a), λ(q, a)] and λ'([q, b]) = b. 


Thus, 


■ δ'([q0, 0], a) = [δ(q0, a), λ(q0, a)] = [q0, 0] 
■ δ'([q0, 0], b) = [δ(q0, b), λ(q0, b)] = [q1, 1] 
■ δ'([q0, 1], a) = [δ(q0, a), λ(q0, a)] = [q0, 0] 
■ δ'([q0, 1], b) = [δ(q0, b), λ(q0, b)] = [q1, 1] 
■ δ'([q1, 0], a) = [δ(q1, a), λ(q1, a)] = [q0, 1] 
■ Similarly, δ'([q1, 0], b) = [q1, 0] 
■ δ'([q1, 1], a) = [δ(q1, a), λ(q1, a)] = [q0, 1] 
■ δ'([q1, 1], b) = [δ(q1, b), λ(q1, b)] = [q1, 0]. 


We also find λ'([q, b]) = b ∀ states, i.e., 


■ λ'([q0, 0]) = 0 
■ λ'([q0, 1]) = 1 
■ λ'([q1, 1]) = 1 
■ λ'([q1, 0]) = 0. 


Finally, we draw the transition table and diagram for the Moore machine as follows: 


Table 8.18 Transition Table for Moore Machine 


Figure 8.32 Equivalent Mo for the above Me. 


EXAMPLE 8.6.5: Convert the Mealy machine in figure 8.33 into an equivalent Moore 
machine. 


Figure 8.33 Mealy Machine. 


Solution: Given, Me = ({q0, p0, p1}, {0, 1}, {n, y}, δ, q0, λ). 
We construct Mo = (Q × Γ, Σ, Γ, δ', [q0, b0], λ') where, 


Q × Γ = {[q0, n], [q0, y], [p0, n], [p0, y], [p1, n], [p1, y]}, Σ = {0, 1} and Γ = {n, y}. 
To find: δ'([q, b], a) = [δ(q, a), λ(q, a)] and λ'([q, b]) = b. 


We have, 


■ δ'([q0, n], 0) = [δ(q0, 0), λ(q0, 0)] = [p0, n] 
■ δ'([q0, n], 1) = [δ(q0, 1), λ(q0, 1)] = [p1, n] 
■ δ'([q0, y], 0) = [δ(q0, 0), λ(q0, 0)] = [p0, n] 
■ δ'([q0, y], 1) = [δ(q0, 1), λ(q0, 1)] = [p1, n] 
■ δ'([p0, n], 0) = [δ(p0, 0), λ(p0, 0)] = [p0, y] 
■ δ'([p0, n], 1) = [δ(p0, 1), λ(p0, 1)] = [p1, n] 
■ δ'([p1, n], 0) = [δ(p1, 0), λ(p1, 0)] = [p0, n] 
■ δ'([p1, n], 1) = [δ(p1, 1), λ(p1, 1)] = [p1, y] 
■ δ'([p1, y], 0) = [δ(p1, 0), λ(p1, 0)] = [p0, n] 
■ δ'([p1, y], 1) = [δ(p1, 1), λ(p1, 1)] = [p1, y] 
■ δ'([p0, y], 0) = [δ(p0, 0), λ(p0, 0)] = [p0, y] 
■ δ'([p0, y], 1) = [δ(p0, 1), λ(p0, 1)] = [p1, n]. 


We also find λ'([q, b]) = b ∀ states, 


■ λ'([q0, n]) = n 
■ λ'([p0, n]) = n 
■ λ'([p1, n]) = n 
■ λ'([q0, y]) = y 
■ λ'([p0, y]) = y 
■ λ'([p1, y]) = y. 


We finally draw the transition table and diagram as follows: 


Table 8.19 Transition Table for the Moore Machine 


Figure 8.34. Equivalent Mo for the above Me. 


1. What is a Moore machine? Present the formal definition of a Moore machine. 
2. What is a Mealy machine? Present the formal definition of a Mealy machine. 
3. Bring out the differences between Moore and Mealy Machines. 
4. Design a Mealy machine to print 0’s complement of an input bit string. 


5. Design a Moore machine to compute the number of substrings of the form aba that 
occurs in an arbitrary input string over the alphabet {a, b} and output alphabet {0,1}. 
Trace the working of the machine for the simple string babababbabaa. 


6. Prove that for every Moore machine, there is an equivalent Mealy machine. 
7. Convert the following Mealy machines to equivalent Moore machines. 


a. Figure 8.35. Mealy Machine. 


b. Figure 8.36. Mealy Machine. 


c. Figure 8.37. Mealy Machine. 


Context-Free Grammars and 
Context-Free Languages 


‘Grammar is the mathematics of a language and mathematics is the grammar of creation.’ 


Introduction 


In any language such as English, Hindi or Sanskrit, words can be combined in several 
ways. Naturally, some combinations form valid sentences, while others do not. The validity 
of a sentence is determined by the grammar of a language, which comprises a set of 
rules. For instance, ‘The boy prepares tea quickly’, although meaningless, is a perfectly 
legal sentence. In other words, the sentences in a language may be nonsensical, but they 
must obey the rules of grammar. The discussion in this chapter deals with only the syntax 
of sentence (the way the words are combined) and not with the semantics of sentences 
(meaning). 

In the previous chapters, two different (though equivalent) methods, finite automata 
and regular expressions, were introduced for describing languages. These methods have 
their own limitations in the sense that some simple languages, such as {0^n 1^n | n ≥ 0}, 
cannot be described by these methods. Formal languages and grammars are widely used 
in connection with programming languages. During programming, we proceed with 
an intuitive knowledge of the languages, which leads to errors. Therefore, a precise 
description of the language is needed, at almost every step, which helps to understand 
the syntax diagrams found in programming texts. Among the ways in which programming 
languages can be defined precisely, grammars or context-free grammars are most widely 
used. This method happens to be a very powerful method and such grammars can 
describe certain features which have a recursive structure. Context-free-grammars (CFG) 
were first used in the study of human languages. Actually, the understanding of the 
relationship of terms such as noun, verb, preposition and their respective phrases, leads 
to a natural recursive process (because of the possibility of appearance of noun phrases 
inside verb phrases and vice versa). CFG are capable of capturing important aspects of 
these relationships. 

Now, an important application of CFG could be found in the specification and compilation 
of programming languages. A grammar for a programming language 
a. facilitates the learning process of the language syntax, and 
b. provides reference for the designers of compilers and interpreters. 


Most compilers and interpreters contain a component, called a parser, that extracts the 
syntactic structure of a program prior to its execution. Such a parser is typically 
constructed from a CFG. Further, the collection of languages generated by CFGs is called 
the class of context-free languages (CFL), which includes all the regular languages as well 
as some additional languages. In this chapter, a formal definition and properties of CFG 
and CFL are discussed. 


9.1 Some Important Illustrations 


9.1.1 CFG in programming languages 


The grammar that describes a typical language like PASCAL is very extensive. Hence, a 
smaller language (i.e. a part of it) is considered. 


For the set of all legal identifiers in PASCAL (which is a language), the grammar is as 
follows: 


<id> → <letter> <rest> 
<rest> → <letter> <rest> | <digit> <rest> | ε 
<letter> → a | b | ... | z 
<digit> → 0 | 1 | ... | 9 


In this grammar, the variables are <id>, <letter>, <digit> and <rest>, with 
a, b, ..., z, 0, 1, ..., 9 as the terminals. 
The derivation of the identifier a0 is 
<id> ⇒ <letter> <rest> 
⇒ a <rest> 
⇒ a <digit> <rest> 
⇒ a0 <rest> 
⇒ a0. 


The transition diagram for the above derivation is: 




Figure 9.1. Transition diagram. 
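
The identifier grammar of section 9.1.1 can be read directly as a recursive recogniser: each function below mirrors one variable of the grammar. This is a minimal Python sketch; the function names `derives_rest` and `is_identifier` are hypothetical, and the terminals are assumed to be the lowercase letters and decimal digits listed in the grammar.

```python
import string

LETTERS = set(string.ascii_lowercase)   # terminals of <letter>
DIGITS = set(string.digits)             # terminals of <digit>


def derives_rest(s):
    """<rest> -> <letter> <rest> | <digit> <rest> | epsilon"""
    if s == "":
        return True                      # the epsilon alternative
    return (s[0] in LETTERS or s[0] in DIGITS) and derives_rest(s[1:])


def is_identifier(s):
    """<id> -> <letter> <rest>"""
    return len(s) > 0 and s[0] in LETTERS and derives_rest(s[1:])


print(is_identifier("a0"))   # a letter followed by a digit
print(is_identifier("0a"))   # starts with a digit, so <id> cannot derive it
```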


9.1.2 CFG in English language 


A language is meaningful, only if a grammar is used to derive the language. 
For example, English grammar has rules for constructing sentences as enlisted below: 


A sentence can be a subject followed by a predicate. 

A subject can be a noun-phrase. 

A noun-phrase can be an adjective, followed by a noun-phrase. 
A noun-phrase can be an article, followed by a noun-phrase. 

A noun-phrase can be a noun. 

A predicate can be a verb or a verb followed by a noun-phrase. 
A noun can be: person, fish, stapler, book, bird, dog. 

A verb can be: reads, touches, grabs, eats, sings, barks. 

An adjective can be: big, small, beautiful, wonderful. 

An article can be: the, a, an. 


The symbolic representations of these rules are: 


<sentence> → <subject> <predicate> 
<subject> → <noun-phrase> 
<noun-phrase> → <adjective> <noun-phrase> 
<noun-phrase> → <article> <noun-phrase> 
<noun-phrase> → <noun> 
<predicate> → <verb> <noun-phrase> 
<predicate> → <verb>. 


Now, let us construct or derive the following sentences using the above rules. 
a. ‘The small person eats the big fish.’ 
<sentence> ⇒ <subject> <predicate> 
⇒ <noun-phrase> <predicate> 
⇒ <noun-phrase> <verb> <noun-phrase> 
⇒ <article> <noun-phrase> <verb> <noun-phrase> 
⇒ <article> <adjective> <noun-phrase> <verb> <noun-phrase> 
⇒ <article> <adjective> <noun> <verb> <noun-phrase> 
⇒ <article> <adjective> <noun> <verb> <article> <noun-phrase> 
⇒ <article> <adjective> <noun> <verb> <article> <adjective> <noun-phrase> 
⇒ <article> <adjective> <noun> <verb> <article> <adjective> <noun> 
⇒ the <adjective> <noun> <verb> <article> <adjective> <noun> 
⇒ the small <noun> <verb> <article> <adjective> <noun> 
⇒ the small person <verb> <article> <adjective> <noun> 
⇒ the small person eats <article> <adjective> <noun> 
⇒ the small person eats the <adjective> <noun> 
⇒ the small person eats the big <noun> 
⇒ the small person eats the big fish. 
b. ‘The bird sings.’ 
<sentence> ⇒ <noun-phrase> <predicate> 
⇒ <article> <noun> <predicate> 
⇒ <article> <noun> <verb> 
⇒ the <noun> <verb> 
⇒ the bird <verb> 
⇒ the bird sings. 
c. ‘A dog barks.’ 
<sentence> ⇒ <noun-phrase> <predicate> 
⇒ <noun-phrase> <verb> 
⇒ <article> <noun> <verb> 
⇒ a <noun> <verb> 
⇒ a dog <verb> 
⇒ a dog barks. 


d. ‘Ram reads.’ 


<sentence> ⇒ <noun-phrase> <predicate> 
⇒ <noun> <predicate> 
⇒ <noun> <verb> 
⇒ Ram <verb> 
⇒ Ram reads. 


d. ‘Ram reads.’ 


<sentence> =><noun-phrase> <predicate> 
=> <noun> <predicate> 
=><noun> <verb> 
=> Ram <verb> 


= Ram reads. 
Thus, the language of this grammar is 


L = { 
‘a bird sings’, 
‘a dog barks’, 
‘Ram reads’, 
‘small person eats the big fish’, 
... 
} 
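
The English grammar of this section can be replayed mechanically. The sketch below stores the rules in a dictionary and always expands the leftmost variable, using a list of rule indices to pick the alternative at each step. The encoding, the name `leftmost_derivation` and the particular list of choices are assumptions made for illustration only.

```python
GRAMMAR = {
    "<sentence>": [["<subject>", "<predicate>"]],
    "<subject>": [["<noun-phrase>"]],
    "<noun-phrase>": [["<adjective>", "<noun-phrase>"],
                      ["<article>", "<noun-phrase>"],
                      ["<noun>"]],
    "<predicate>": [["<verb>", "<noun-phrase>"], ["<verb>"]],
    "<noun>": [[w] for w in ["person", "fish", "stapler", "book", "bird", "dog"]],
    "<verb>": [[w] for w in ["reads", "touches", "grabs", "eats", "sings", "barks"]],
    "<adjective>": [[w] for w in ["big", "small", "beautiful", "wonderful"]],
    "<article>": [[w] for w in ["the", "a", "an"]],
}


def leftmost_derivation(grammar, start, choices):
    """Expand the leftmost variable once per entry of `choices`; return every sentential form."""
    form = [start]
    steps = [form]
    for choice in choices:
        i = next(k for k, sym in enumerate(form) if sym in grammar)
        form = form[:i] + grammar[form[i]][choice] + form[i + 1:]
        steps.append(form)
    return steps


# Rule indices that spell out "the bird sings".
steps = leftmost_derivation(GRAMMAR, "<sentence>", [0, 0, 1, 0, 2, 4, 1, 4])
for form in steps:
    print(" ".join(form))
```

Each printed line is one sentential form, so the output reads like the hand derivations above, with the intermediate <subject> step written out.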


9.1.3 CFG in construction of tokens 


The smallest individual unit of a program is known as a token, such as identifier, constant, 
string, operator etc. For constructing identifiers in C language, the rules are listed below: 


■ An identifier (id) is a combination of letters, digits and underscores. 

■ The first character in an identifier is either a letter or an underscore, followed by any 
number of letters or digits. 

■ Identifiers are case-sensitive. 


The grammar for the above rules is: 


<id> → <letter> <rest> 
<rest> → <letter> <rest> | <digit> <rest> | ε 
<letter> → _ | a | b | ... | z 
<digit> → 0 | 1 | 2 | ... | 9. 


Now, let us construct or derive the following identifiers from the above rules. 


a. P1 


<id> ⇒ <letter> <rest> 
⇒ <letter> <digit> <rest> 
⇒ P <digit> <rest> 
⇒ P1 <rest> 
⇒ P1 


b. emp_12 


<id> ⇒ <letter> <rest> 
⇒ <letter> <letter> <rest> 
⇒ <letter> <letter> <letter> <rest> 
⇒ <letter> <letter> <letter> <letter> <rest> 
⇒ <letter> <letter> <letter> <letter> <digit> <rest> 
⇒ <letter> <letter> <letter> <letter> <digit> <digit> <rest> 
⇒ e <letter> <letter> <letter> <digit> <digit> <rest> 
⇒ em <letter> <letter> <digit> <digit> <rest> 
⇒ emp <letter> <digit> <digit> <rest> 
⇒ emp_ <digit> <digit> <rest> 
⇒ emp_1 <digit> <rest> 
⇒ emp_12 <rest> 
⇒ emp_12. 


The language of the above grammar is L = {P1, emp_12, _count, i, ...}. 


Thus the topic of context-free languages (languages defined by context-free grammars) is 
perhaps the most important aspect of formal language. It is applied in defining programming 
languages, in formalising the notion of parsing, for simplifying translation of programming 
languages, in various string-processing applications and in the construction of efficient 
compilers. 


9.2 Ways to Use a Grammar 


There are two ways to use a grammar. 


■ To generate a string of the language. This is easy to do. Start with the start symbol and 
apply derivation steps, until a string (composed entirely of terminals) is obtained. 


EXAMPLE 9.2.1: Construction of identifiers in Pascal. 
■ To recognise strings, i.e., to test whether they belong to the language. 


EXAMPLE 9.2.2: An automaton to recognise whether the given string is a palindrome, 
over Σ = {a, b}. The grammar used is: 


S → aSa 
S → bSb 
S → ε. 
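
Recognition with the grammar of Example 9.2.2 can be sketched as a recursive test that undoes one production at a time. Note that the three productions as printed generate only even-length palindromes, since every step adds two matching symbols; the sketch follows the printed grammar exactly. The function name `derives` is hypothetical.

```python
def derives(s):
    """Return True if S =>* s for S -> aSa | bSb | epsilon."""
    if s == "":
        return True                       # S -> epsilon
    if len(s) >= 2 and s[0] == s[-1] and s[0] in "ab":
        return derives(s[1:-1])           # undo S -> aSa or S -> bSb
    return False


print(derives("abba"))   # obtained via S => aSa => abSba => abba
print(derives("aba"))    # odd length: not derivable from these three productions
```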


Regular Language | Context-free Language 
• It is the language that is described by a regular expression. | • It is the language that is defined by a context-free grammar. 
• For regular languages, the corresponding acceptor is a finite automaton. | • For context-free languages, the corresponding acceptor is a push-down automaton. 
• Regular languages are closed under union, product, Kleene star, intersection and complement. | • Context-free languages are closed under union, product and Kleene star. 
• Regular languages are represented using regular expressions or finite automata. | • Context-free languages, including some non-regular languages, are represented using context-free grammars. 
• They are used in text editors, sequential circuits etc. | • They are used in programming languages, statements and compilers. 


Table 9.1 Comparison of regular and context-free languages 


9.3 Structure of Grammar 


Grammars not only produce natural languages, but also formal ones. If L is a language over 
an alphabet A, then a grammar for L consists of a set of rules of the form: 
x → y 
Non-terminal (Variables)    Production Rules    Terminals 
Figure 9.2. Structure of grammar. 


where x and y denote strings of symbols taken from A and from a set of grammar symbols, 
disjoint from A. 


9.3.1 Production rule 


The grammar rule x → y is called a production rule. 


EXAMPLE 9.3.1: <id> → <letter> <rest> is a production rule. 


9.3.2 Start symbol 


Every grammar has a special grammar symbol called the start symbol. 


EXAMPLE 9.3.2: 
■ <sentence> → <noun-phrase> <predicate> 
■ <id> → <letter> <rest> 
Here <sentence> and <id> are the start symbols. 


For a grammar, there must be at least one production whose left side consists of only the 
start symbol. 


Note-1: If S is the start symbol for a grammar, then there must be at least one 
production of the form S → y. 


9.3.3. Non-terminals and Terminals 


a. Non-terminals: The symbols that may be replaced by other symbols are called 
non-terminals or variables. 

b. Terminals: The symbols that cannot be replaced by other symbols are called 
terminals. 
EXAMPLE 9.3.3: (Non-terminals and terminals) 

■ <sentence> → <noun-phrase> <predicate> 
<noun-phrase> → <article> <noun> 
<predicate> → <verb>. 

Non-terminals: <noun-phrase> and <predicate>, 
Terminals: <verb>, <noun> and <article> 


■ <id> → <letter> <rest> 
<rest> → <letter> <rest> | <digit> <rest> 
<letter> → a | b | ... | z, <digit> → 0 | 1 | ... | 9 
Non-terminals: <rest>, <digit>, <letter> 
Terminals: a, b, ..., z, 0, 1, ..., 9. 


■ If A = {a, b, c}, then grammar for the language A is 
S → ε 
S → aS 
S → bS 


Non-terminals: S 
Terminals: a, b. 

c. A grammar rule can be of the form: 
One non-terminal → string of non-terminals, or, 
One non-terminal → choice of terminals. 
EXAMPLE 9.3.4: 


■ <rest> → <letter> <rest> | <letter> <digit> | ε 
<id> → <letter> <rest> 

■ S → aS 
S → a | b | c 


d. Convention 


■ Terminals will typically be lower case letters. 
■ Non-terminals will typically be upper case letters. 


e. ε is neither a non-terminal (since it cannot be replaced with something else) nor a 
terminal (since it disappears from the string). 


9.3.4. Productions 


Productions refer to the set of rules used to construct the valid sentences from the given 
grammar. 
EXAMPLE 9.3.5: (Productions) 


■ <id> → <letter> <rest> 
■ <sentence> → <noun-phrase> <predicate> 
■ S → aSa 
S → bSb 
S → a. 


9.3.5 Forms of production 


Production can be 


a. Unit-production: Any production of the form A → B is a unit production. 


EXAMPLE 9.3.6: 
(i) S → Aa | B 
(ii) B → A | bb 


Here, S → B and B → A are unit productions. S → Aa and B → bb are non-unit 
productions. 


b. ε-production: Any production of the form A → ε is called an ε-production. 


EXAMPLE 9.3.7: 


(i) <rest> → <letter> <rest> | <letter> <digit> | ε 
(ii) A → aa | ε 


c. Recursive production: A production is called recursive if its left side occurs on its 
right side, or if non-terminals are present on both sides of ‘→’. 


EXAMPLE 9.3.8: 


(i) S → aS 
(ii) <rest> → <letter> <rest> 


d. Indirectly recursive production: A production A → x is indirectly recursive if it is not 
directly recursive, but A reappears when further productions are applied to x. 


EXAMPLE 9.3.9: Consider the production rules of the form 
S → b | aA 
A → c | bS. 




Downloaded from https://www.cambridge.org/core. Stockholm University Library, on 06 Dec 2018 at 08:03:03, subject to the Cambridge Core terms of use, available at 
https://www.cambridge.org/core/terms. https://doi.org/10.1017/UP09788175968363.010 


A Textbook on Automata Theory 


Here, 


■ S ⇒ aA 
⇒ abS 
■ A ⇒ bS 
⇒ baA. 


Hence, S → aA and A → bS are indirectly recursive. 


9.3.6 Backus Normal Form (BNF) 


The compact notation, used to represent the production rule, is called BNF. 


EXAMPLE 9.3.10: (BNF) 


(i) <rest> → <letter> <rest> 
<rest> → <digits> <rest> 
<rest> → ε 


These can be written using BNF as 


<rest> → <letter> <rest> | <digits> <rest> | ε 
(ii) S → SS 
S → a 
S → ε. 


BNF: S → SS | a | ε. 


We discuss the extended BNF in the next chapter, in section 10.9.2. 


9.3.7 Derivation 


The sequence of substitutions, used to obtain a string, is called a derivation. Derivation 
refers to replacing an instance of a given string’s non-terminal, by the right-hand side of the 
production rule, whose left-hand side contains the non-terminal to be replaced. 


If x and y are sentential forms (see section 9.9.4) and α → β is a production, then the 
replacement of α by β in xαy is called a derivation and denoted by: 
S → xαy (Production rule). 
α → β 


xαy ⇒ xβy. (Derivation) 
Derivation produces a new string from a given string. Therefore derivation can be used 
repeatedly to obtain a new string from a given string. If the string obtained, as a result of 
the derivation, contains only terminal symbols, then no further derivations are possible. 


Note-2: Two types of arrows: 
→ Used in the statement of productions. 
⇒ Used in the derivation of a word. 


EXAMPLE 9.3.11: (Derivation) 


■ Consider the production rules: 
S → aSb 
S → ε 
Derivation for ab: 
S ⇒ aSb 
⇒ aεb 
⇒ ab 
■ Consider the productions: 
S → Ab 
A → aAb 
A → ε 
Derivation for aabbb: 
S ⇒ Ab 
⇒ aAbb 
⇒ aaAbbb 
⇒ aabbb 


■ <id> → <letter> <rest> 
<rest> → <letter> <rest> | <digit> <rest> | ε 
<letter> → a | b | ... | z 
<digit> → 0 | ... | 9 


Derivation for a0: 
<id> ⇒ <letter> <rest> 
⇒ <letter> <digit> <rest> 
⇒ a0ε 
⇒ a0 


9.3.8 Forms of derivation 


Derivation can be leftmost or rightmost derivation. 


■ Leftmost derivation: A derivation is said to be leftmost if, in each step, the leftmost 
variable in the sentential form is replaced. 
■ Rightmost derivation: A derivation is said to be rightmost if, in each step, the rightmost 
variable in the sentential form is replaced. 


EXAMPLE 9.3.12: (Leftmost and rightmost derivation) 
(i) Consider the production rules: 
S → AB 
A → aaA 
A → ε 
B → Bb 
B → ε 
Leftmost derivation for aab 
S ⇒ AB 
⇒ aaAB 
⇒ aaB 
⇒ aaBb 
⇒ aab 
Rightmost derivation for aab 
S ⇒ AB 
⇒ ABb 
⇒ Ab 
⇒ aaAb 
⇒ aab 
(ii) Consider the production rules: 
S → aAB 
A → bBb 
B → A | ε 
Leftmost derivation for abbbb 
S ⇒ aAB 
⇒ abBbB 
⇒ abAbB 
⇒ abbBbbB 
⇒ abbbbB 
⇒ abbbb 
Rightmost derivation for abbbb 
S ⇒ aAB 
⇒ aA 
⇒ abBb 
⇒ abAb 
⇒ abbBbb 
⇒ abbbb 
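
The difference between the two forms of derivation can be made concrete with a small Python sketch. It assumes the convention of section 9.3.3 (upper case letters are variables, lower case letters are terminals) and checks, at every step, that the production being applied really rewrites the leftmost (or rightmost) variable. The name `derive` and the list-of-pairs encoding of the steps are hypothetical.

```python
def derive(start, steps, leftmost=True):
    """Apply the productions in `steps` (pairs head, body) and return all sentential forms.

    Raises ValueError if a step does not rewrite the leftmost/rightmost variable.
    """
    form, forms = start, [start]
    for head, body in steps:
        positions = [i for i, c in enumerate(form) if c.isupper()]
        i = positions[0] if leftmost else positions[-1]
        if form[i] != head:
            raise ValueError(f"{head} is not the selected variable in {form}")
        form = form[:i] + body + form[i + 1:]
        forms.append(form)
    return forms


# Grammar (i) of Example 9.3.12: S -> AB, A -> aaA | epsilon, B -> Bb | epsilon.
left = derive("S", [("S", "AB"), ("A", "aaA"), ("A", ""), ("B", "Bb"), ("B", "")])
right = derive("S", [("S", "AB"), ("B", "Bb"), ("B", ""), ("A", "aaA"), ("A", "")],
               leftmost=False)
print(" => ".join(left))
print(" => ".join(right))
```

Both runs end in the same terminal string; only the order in which the variables are rewritten differs.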


9.3.9 The ⇒* notation in derivation 


Any derivation involves the application of production rules. If a production rule is applied 
once, then we write: 


w1 ⇒ w2. 
If a sequence of production rules is applied, then we write: 

w1 ⇒* wn, 


which means w1 ⇒ w2 ⇒ w3 ⇒ ... ⇒ wn. 


EXAMPLE 9.3.13: (⇒* notation in derivation) 


(i) Consider the productions: 
S → aSb 
S → ε 
Derivation for aaabbb: 
We have 
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb 
Instead, we write this as S ⇒* aaabbb. 


(ii) For the production rules: 
S → aSb 
S → ε 
We say 
S ⇒* ε (S ⇒ ε) 
S ⇒* ab (S ⇒ aSb ⇒ aεb = ab) 
S ⇒* aabb (S ⇒ aSb ⇒ aaSbb ⇒ aabb) 
S ⇒* aaSbb (S ⇒ aSb ⇒ aaSbb). 


(iii) Consider 
S → Ab 
A → aAb 
A → ε 
Derivation for aaaaabbbbbb: 
S ⇒ Ab ⇒ aAbb ⇒ aaAbbb 
⇒ aaaAbbbb 
⇒ aaaaAbbbbb 
⇒ aaaaaAbbbbbb 
⇒ aaaaabbbbbb. 
Instead, we write this as, 
S ⇒* aaaaabbbbbb 
or S ⇒* a^n b^n b. 


We further discuss derivation tree in detail in section 9.9. 


9.4 Formal Definition of Context-Free Grammars 


A context free grammar is a method for (recursively) describing the grammar of a given 
language. A CFG is a set of variables, each of which represents a language. The language, 
represented by the variables, is described in terms of primitive symbols called terminals. 
The rules relating to the variables are called productions. 


A CFG, G, is formally defined as a 4-tuple 


G=(V,T,P,S), 


where V—A finite set of Variables or Non-terminals. 
T—A finite set of terminals. 
P—A finite set of production rules. 
S—Start symbol, S € V. 


Each production is of the form A → α, where A is a variable and α is a string of terminals 
and/or non-terminals, i.e., A ∈ V and α ∈ (V ∪ T)*. 
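
The 4-tuple definition translates naturally into a small data structure. The following Python sketch is one possible encoding, not a standard API: the class name `CFG`, its field names and the method `is_well_formed` are hypothetical, and each symbol is assumed to be a single character.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    variables: frozenset    # V
    terminals: frozenset    # T
    productions: tuple      # P, as pairs (head, body) with body a string over V and T
    start: str              # S

    def is_well_formed(self):
        """Check S in V, V and T disjoint, and every production A -> alpha has
        A in V and alpha in (V u T)*."""
        symbols = self.variables | self.terminals
        return (self.start in self.variables
                and self.variables.isdisjoint(self.terminals)
                and all(head in self.variables and all(s in symbols for s in body)
                        for head, body in self.productions))


# Illustrative grammar: S -> aAS | a, A -> SbA | SS | ba.
g = CFG(frozenset("SA"), frozenset("ab"),
        (("S", "aAS"), ("S", "a"), ("A", "SbA"), ("A", "SS"), ("A", "ba")),
        "S")
print(g.is_well_formed())
```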


EXAMPLE 9.4.1: (Context-free grammar) 
a. G = ({S}, {a, b}, P, S), where P contains 
P = { S → aSa, 
S → bSb, 
S → ε 
}. 
b. G = ({<id>, <letter>, <rest>, <digit>}, {a..z, 0..9}, P, <id>), where P is 
P = { <id> → <letter> <rest>, 
<rest> → <letter> <rest> | <digit> <rest> | ε, 
<letter> → a | b | ... | z, 
<digit> → 0 | 1 | ... | 9 
}. 
c. G = ({S, A}, {a, b}, P, S) where P is 
S → aAS | a 
A → SbA | SS | ba. 


9.5 Types of Grammar 


In this section, we discuss some of the commonly used grammars, and give the complete 
classification of grammar in chapter 15. 


9.5.1 Linear and Nonlinear Grammars 


A grammar, with at most one variable (non-terminal) at the right side of a production, is a 
linear grammar, otherwise it is nonlinear. 


EXAMPLE 9.5.1: (Linear grammar) 


(i) S → aSb 
S → ε 

(ii) S → Ab 
A → aAb 
A → ε 

(iii) S → A 
A → aB | ε 
B → Ab. 


EXAMPLE 9.5.2: (Nonlinear grammar) 


(i) S → SS 
S → ε 
S → aSb 
S → bSa. 
(ii) S → aA 
A → ε 
S → aSAS. 


9.5.2 Right and Left-Linear Grammars 


Linear grammars are further classified into: 


a. Right-linear grammar 
A grammar G = (V, T, S, P) is right-linear, if all productions are of the form 
A → xB 
or, A → x 
where A, B ∈ V and x ∈ T*. 


EXAMPLE 9.5.3: (Right-linear grammar) 
(i) S → abS 
S → a. 
(ii) S → A 
A → aA 
A → ε. 
b. Left-linear grammar 
A grammar G = (V, T, S, P) is left-linear, if all productions are of the form 
A → Bx 
or, A → x 
where A, B ∈ V and x ∈ T*. 


EXAMPLE 9.5.4: (Left-linear grammar) 
(i) S → Aab 
A → Aab | B 
B → a 
(ii) S → Ab 
S → Sb 
A → ε. 


9.5.3 Regular Grammar 


A grammar G = (V, T, S, P) is said to be regular, if it is either right-linear or left-linear. 


EXAMPLE 9.5.5: (Regular grammar) 


(i) S → abS 
S → a 

(ii) S → Aab 
A → Aab | B 
B → a 


9.5.4 Non-regular Grammar 


A grammar G = (V, T, S, P) is non-regular, if it is neither right-linear nor left-linear. 


EXAMPLE 9.5.6: (Non-regular grammar) 


(i) S → aSb 
S → ε 
(ii) S → Ab 
A → aAb 
A → ε 
(iii) S → ε 
S → aSb 
S → bSa 
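
The definitions of right-linear, left-linear and regular grammars can be checked mechanically. The sketch below assumes single-character symbols with upper case variables and lower case terminals, and represents ε by the empty string; the function names are hypothetical.

```python
def is_right_linear(productions):
    """Every body has the form x or xB with x in T*: no variable except possibly the last symbol."""
    return all(not any(c.isupper() for c in body[:-1]) for _, body in productions)


def is_left_linear(productions):
    """Every body has the form x or Bx with x in T*: no variable except possibly the first symbol."""
    return all(not any(c.isupper() for c in body[1:]) for _, body in productions)


def is_regular(productions):
    return is_right_linear(productions) or is_left_linear(productions)


right = [("S", "abS"), ("S", "a")]                          # S -> abS | a
left = [("S", "Aab"), ("A", "Aab"), ("A", "B"), ("B", "a")]  # S -> Aab, A -> Aab | B, B -> a
neither = [("S", "aSb"), ("S", "")]                          # S -> aSb | epsilon

print(is_right_linear(right), is_left_linear(left), is_regular(neither))
```

Note that the check is applied to the grammar as a whole: a grammar that mixes right-linear and left-linear productions is linear but not regular in the sense defined above.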


9.5.5 Simple or S-grammar 
A grammar G = (V, T, S, P) is said to be a simple grammar or S-grammar, if all productions 
are of the form 


A → ax, 


where A ∈ V, a ∈ T and x ∈ V*, and any pair (A, a) occurs at most once in P. 


EXAMPLE 9.5.7: S-grammar 
(i) S → aS | bSS | c is a simple grammar, since pairs (S, a), (S, b) occur only once in 
production. 
(ii) S → aAS | a 
A → SbA. 
Pair (S, a) occurs only once in production and is a S-grammar. 


(iii) S → aS | bSS | aSS | c is not a simple grammar, since pair (S, a) occurs twice in 
production as aS and aSS. 


9.5.6 Recursive grammar 


A grammar G = (V, T, S, P) is recursive, if it contains either a recursive production or an 
indirectly recursive production. 
EXAMPLE 9.5.8: Recursive grammar 
(i) S → aS 
(ii) S → SS 
S → ε 
S → aSb 
(iii) S → b | aA 
A → c | bS. 


9.6 Language of a Context-Free Grammar 


Definition: The language generated (defined, derived, produced) by a CFG, is the set of 
all strings of terminals, that can be produced from the start symbol S$ using the productions 
as substitutions. A language generated by a CFG, G is called a context-free language (CFL) 
and is denoted by L(G). 


Thus, if G is a CFG, with S as a start symbol and set of terminals T, then the language 
of G is the set defined as 
L(G) = {w | w ∈ T* and S ⇒* w}. 


We discuss the context free language and its properties in detail in chapter 15. 
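
The set L(G) can be explored for small grammars by enumerating derivations. The following is a rough Python sketch that performs a breadth-first search over sentential forms, always expanding the leftmost variable. It assumes single-character symbols with upper case variables; the name `language_up_to` and the `slack` cut-off on the length of sentential forms are assumptions of this sketch, and for grammars in which many variables vanish through ε-productions a larger `slack` may be needed to find every word.

```python
from collections import deque


def language_up_to(productions, start, max_len, slack=2):
    """Return the words of L(G) of length <= max_len that are found by the bounded search.

    `productions` maps each variable to a list of bodies (strings; "" stands for epsilon).
    """
    seen, words = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        i = next((k for k, c in enumerate(form) if c.isupper()), None)
        if i is None:                      # only terminals left: a word of L(G)
            if len(form) <= max_len:
                words.add(form)
            continue
        for body in productions[form[i]]:  # expand the leftmost variable
            new_form = form[:i] + body + form[i + 1:]
            if len(new_form) <= max_len + slack and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sorted(words, key=lambda w: (len(w), w))


print(language_up_to({"S": ["aSb", ""]}, "S", 4))   # words of {a^n b^n} up to length 4
print(language_up_to({"S": ["aS", ""]}, "S", 3))    # words of {a^n} up to length 3
```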


EXAMPLE 9.6.1: For the given string sets, 


a. what is the language accepted by CFG 
b. what grammar is required to derive these strings 
c. what is the regular expression? 
(i) {ε, a, aa, ..., a^n, ...}. 
■ L(CFG) = {a^n | n ≥ 0}. 
■ The CFG to derive these strings is 
P = { 
S → ε 
S → aS 
} 
where S is the start symbol, V = {S} and T = {a}. 
■ RE = a* 
Derivation for the string aaa: 
S ⇒ aS ⇒ aaS ⇒ aaaS ⇒ aaa, 
i.e., S ⇒* aaa. 
(ii) {ε, ab, aabb, ..., a^n b^n, ...} 


■ L(CFG) = {a^n b^n | n ≥ 0}. 

■ CFG to derive the above strings: 
Here, any string in this language is either ε or of the form axb for some string x 
in the language. The following grammar will derive any of the strings. 


P = { 
S → ε 
S → aSb 
} 
where V = {S}, T = {a, b} and S is the start symbol. 


■ RE: none. The language {a^n b^n | n ≥ 0} is not regular, so no regular expression 
describes it; (ab)* denotes the different language {(ab)^n | n ≥ 0}. 
Derivation for the string aaabbb: 


S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb, 
i.e., S ⇒* aaabbb. 
(iii) {ε, ab, abab, ..., (ab)^n, ...} 


■ L(CFG) = {(ab)^n | n ≥ 0}. 
■ Any string in this language is either ε or of the form abx, for some string x in 
the language. The grammar is: 


P = { 
S → ε 
S → abS 
} 


where V = {S}, T = {a, b} and S is the start symbol. 
■ RE = (ab)* 
Derivation for the string ababab: 


S ⇒ abS ⇒ ababS ⇒ abababS ⇒ ababab. 
(iv) {a, aab, aabab, ...} 


■ L(CFG) = {(ab)^n · a | n ≥ 0}. 
■ Grammar: 


where V = {S}, T = {a, b} and S is the start symbol. 
■ RE = (ab)* · a 
(v) {ε, aa, bb, abba, ...}. 
■ L(CFG) = {w w^R | w ∈ {a, b}*}, i.e., the even-length palindromes over {a, b}. 
■ Grammar: 
P = { 
S → aSa, 
S → bSb, 
S → ε 
} 
where V = {S}, T = {a, b} and S is the start symbol. 


■ RE: none. This language is not regular, so no regular expression describes it. 
Derivation for the string abba: 


S ⇒ aSa ⇒ abSba ⇒ abba, 
i.e., S ⇒* abba. 


(vi) {a, b, aa, bb, ...} 


■ L(CFG) = {(a+b)^n | n ≥ 1}. 
■ Grammar: 


P = {S → aS | bS | a | b} where V = {S}, T = {a, b} and S is the start symbol. 


■ RE = (a+b)+ 
Derivation for the string abbab: 


S ⇒ aS ⇒ abS ⇒ abbS ⇒ abbaS ⇒ abbab. 




(vii) L = {ε, aaa, aaaaaa, ...}. 
■ L(CFG) = {w ∈ {a}* : |w| mod 3 = 0} 
■ Grammar: 
P = { 
S → ε 
S → aaaS 
} 


where V = {S}, T = {a} and S is the start symbol. 
■ RE = (aaa)* 


9.7 Operations on Production Rules 


Suppose M and N are languages, whose grammars have disjoint sets of non-terminals. Also, 
assume that start symbols, for the grammars of M and N, are A and B respectively. Then, 
following are the rules to find new grammars generated from M and N: 


■ Union Rule: The language M ∪ N starts with the two productions 
S → A | B. 
■ Product Rule: The language MN starts with the production 
S → AB. 
■ Closure Rule: The language M* starts with the productions 
S → AS | ε. 
EXAMPLE 9.7.1: (Operations on production rules)
a. Union Rule
If L = {ε, a, b, aa, bb, ..., a^n, b^n, ...} with

RE = a* + b*,
L can be written as the union of M and N,

i.e. L = M ∪ N,

where M = {a^n | n ≥ 0} and N = {b^n | n ≥ 0}.

The grammar for L is:

S → A | B (union rule),
A → ε | aA (grammar for M),
B → ε | bB (grammar for N).
b. Product Rule
If L = {ε, ab, aabb, aaaabbbb, aabbb, ...} with RE = a*b*, i.e. L = {a^m b^n | m, n ≥ 0},

L can be written as a product

L = MN,

where M = {a^m | m ≥ 0} and N = {b^n | n ≥ 0}.

Thus the grammar for L is:
S → AB (product rule),
A → ε | aA (grammar for M),
B → ε | bB (grammar for N).

c. Closure Rule

If L = {ε, aa, bb, aabb, aaaabb, bbbb, aaaabbbb, ...}, i.e. L = {aa, bb}* with
RE = (aa + bb)*, and M = {aa, bb}, then L = M*.
The grammar for L is:

S → AS | ε (closure rule),

A → aa | bb (grammar for M).
This grammar can also be written as (by substituting for A):

S → aaS | bbS | ε.
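
The three rules can be tried out mechanically. The following is a minimal Python sketch, not part of the original text: a grammar is assumed to be a dictionary from a non-terminal to a list of right-hand sides, and the helper names `union`, `product`, `closure` and `language` are hypothetical. The function `language` lists every generated string up to a chosen length, so that the grammars of Example 9.7.1 can be compared with the languages they are meant to describe.

```python
def union(g1, start1, g2, start2, start="S"):
    """Union rule: add S -> A | B."""
    return {**g1, **g2, start: [[start1], [start2]]}

def product(g1, start1, g2, start2, start="S"):
    """Product rule: add S -> AB."""
    return {**g1, **g2, start: [[start1, start2]]}

def closure(g, start1, start="S"):
    """Closure rule: add S -> AS | ε (ε is written as the empty list)."""
    return {**g, start: [[start1, start], []]}

def language(grammar, start, max_len):
    """Terminal strings of length <= max_len, found by leftmost expansion.

    Assumption: every cycle of productions adds at least one terminal,
    which holds for the grammars used in this example.
    """
    found, seen, stack = set(), set(), [(start,)]
    while stack:
        form = stack.pop()
        idx = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if idx is None:                      # no non-terminal left: a sentence
            found.add("".join(form))
            continue
        for rhs in grammar[form[idx]]:
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            n_terminals = sum(sym not in grammar for sym in new)
            if n_terminals <= max_len and new not in seen:
                seen.add(new)
                stack.append(new)
    return found

M = {"A": [[], ["a", "A"]]}                  # A -> ε | aA
N = {"B": [[], ["b", "B"]]}                  # B -> ε | bB
PAIRS = {"A": [["a", "a"], ["b", "b"]]}      # A -> aa | bb

print(sorted(language(union(M, "A", N, "B"), "S", 2)))
print(sorted(language(product(M, "A", N, "B"), "S", 2)))
print(sorted(language(closure(PAIRS, "A"), "S", 4)))
```

The new start symbol is assumed not to occur in either input grammar, which mirrors the requirement that the grammars of M and N have disjoint sets of non-terminals.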


9.8 Design of a CFG 


The basic design strategy for CFG is as follows: 


a. Understand the language specification by listing examples of the strings in the
language.
b. Determine the 'base case' productions of the CFG by listing the shortest strings in the
language.
c. Identify the rules for combining smaller sentences into larger ones.

d. Test the CFG obtained on a number of carefully chosen examples. All of the base
cases should be tested, along with all of the alternative productions. Also check that the
grammar is consistent with the strings listed in step a.


EXAMPLE 9.8.1: Obtain a CFG for the language of palindromes over the alphabet Σ =
{a, b, c}.
Solution: The recursive definition of a palindrome is:

a. ε is a palindrome.
b. a, b and c are palindromes.
c. If w is a palindrome, then the strings awa, bwb and cwc are palindromes.

Let the CFG be G = (V, T, P, S), where:

V = {S}

T = {a, b, c}

P = {
S → aSa
S → bSb
S → cSc

S → a | b | c | ε [By definitions a and b]

}
S is the start symbol.

■ Derivation for the string abcba, which is a palindrome, is shown below:
S => aSa => abSba => abcba.
■ Derivation for the string cbaabc, which is a palindrome, is shown below:
S => cSc => cbSbc => cbaSabc => cbaabc.

The language of the CFG is L(G) = {w | w^R = w, w ∈ Σ*}.
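
Step d of the design strategy (testing the grammar) can be automated for small lengths. The sketch below is an illustration only, with hypothetical helper names: it expands the palindrome grammar of Example 9.8.1 by brute force and compares the result with a direct enumeration of all palindromes over {a, b, c}. It relies on the assumption that the grammar has the single non-terminal S, written as the character "S".

```python
from itertools import product

# Right-hand sides of S in Example 9.8.1; "" stands for ε.
RULES = ["aSa", "bSb", "cSc", "a", "b", "c", ""]

def generated(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    done, forms = set(), {"S"}
    while forms:
        next_forms = set()
        for form in forms:
            for rhs in RULES:
                new = form.replace("S", rhs, 1)   # rewrite the only S
                if "S" not in new:
                    if len(new) <= max_len:
                        done.add(new)
                elif len(new) - 1 <= max_len:     # terminals so far fit the bound
                    next_forms.add(new)
        forms = next_forms
    return done

def palindromes(max_len):
    """All palindromes over {a, b, c} of length <= max_len, by definition."""
    return {"".join(p)
            for n in range(max_len + 1)
            for p in product("abc", repeat=n)
            if p == p[::-1]}

print(generated(5) == palindromes(5))
```

A check of this kind does not prove the grammar correct, but a mismatch at small lengths points directly at a missing or superfluous production.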


EXAMPLE 9.8.2: Obtain a CFG for the language of even-length palindromes over the alphabet
Σ = {a, b}.


Solution: Let the CFG be G = (V, T, P, S), where:
V = {S}
T = {a, b}
P = {
S → aSa
S → bSb
S → ε
}


S is the start symbol.
■ Derivation for the string abba, which is an even-length palindrome, is shown below:


S => aSa => abSba => abεba = abba.


EXAMPLE 9.8.3: Obtain a CFG for the language of odd-length palindromes over the alphabet
Σ = {a, b}.
Solution: Let the CFG be G = (V, T, P, S), where:
V = {S}
T = {a, b}
P = {
S → aSa
S → bSb
S → a | b
}


S is the start symbol.
Derivation for the string aaa, which is an odd-length palindrome, is
S => aSa => aaa.
EXAMPLE 9.8.4: Construct a CFG to generate the set of all balanced parentheses over the
alphabet Σ = {(, )}.


Solution: The set of all balanced parentheses over {(, )} is recursively defined as follows:
a. ε and () are balanced.
b. If w is balanced, so is (w).
c. If w and x are balanced, so is wx.
d. Nothing else is balanced.
Let the CFG be G = (V, T, P, S), where:


V = {S}
T = {(, )}
P = {

S → ε | () [Definition-a]
S → (S) [Definition-b]
S → SS [Definition-c]

}
S is the start symbol.


Derivation for ((())), which is balanced, is shown below:
S => (S) => ((S)) => (((S))) => ((())).
EXAMPLE 9.8.5: Obtain a CFG to generate the language of all non-palindromes over the
alphabet Σ = {a, b}.
Solution: The following cases are considered:

a. Generate palindromes on both the left and the right side.
b. Generate a non-palindrome.
c. Make one non-terminal to generate any number of a's and b's.
d. Generate any combination of a's and b's.

Let the CFG be G = (V, T, P, S), where

V = {S, A, B}
T = {a, b}
P = {
S → aSa | bSb [By a]
S → A [By b]
A → aBb | bBa [By c]
B → aB | bB | ε [By d]
}
S is the start symbol.


Derivation for the string abbaab, which is not a palindrome, is shown below:


S => A => aBb => abBb => abbBb => abbaBb => abbaaBb => abbaab.
EXAMPLE 9.8.6: Construct a CFG to generate a restricted class of arithmetic expressions
on integers.


Solution: An arithmetic expression AE can be recursively defined as follows:

a. An expression AE can be an identifier.
b. If AE is any arithmetic expression, then

■ AE + AE
■ AE − AE
■ AE * AE
■ AE / AE
■ AE ^ AE
■ (AE)

are all arithmetic expressions.


Consider a set of operators {+, −, *, /, ^} and an identifier (I), which can start with any of
the letters from {a, b, c}. Then

I → Ia | Ib | Ic | a | b | c.
Let the CFG be G = (V, T, P, S),

where V = {AE, I}
T = {+, −, *, /, ^, (, ), a, b, c}
P = {
AE → I
AE → AE + AE
AE → AE − AE
AE → AE * AE
AE → AE / AE
AE → AE ^ AE
AE → (AE)
I → Ia | Ib | Ic | a | b | c
}
AE is the start symbol.
Derivation to generate the arithmetic expression (a + (b − c)/a):

AE => (AE)
=> (AE + AE)
=> (AE + AE/AE)
=> (AE + (AE)/AE)
=> (AE + (AE − AE)/AE)
=>* (I + (I − I)/I)
=>* (a + (b − c)/a)
EXAMPLE 9.8.7: Find the grammar for the language of decimal numerals by observing that
a decimal numeral is either a digit or a digit followed by a decimal numeral.
Solution: A number N can be recursively defined as follows:

a. A number is a digit (Digit).
b. A number followed by a digit, or a digit followed by a number, is also a number.
N → Digit
N → N Digit | Digit N.
Let S = {+, −, ε} denote the sign of a number, Digit = {0, 1, ..., 9} denote the digits forming
the number, and I an integer, which can be a number (N) or the sign of a number (S) followed
by the number, i.e.,
I → N | SN.

Let the CFG be G = (V, T, P, S),

where V = {Digit, S, N, I}
T = {+, −, 0, 1, ..., 9}

P = {
I → N | SN
N → Digit | N Digit | Digit N
S → + | −
Digit → 0 | 1 | ... | 9
}

I is the start symbol.
Derivation to generate the number 9540 is shown below.

I => N
=> N Digit
=> N0
=> N Digit 0
=> N40
=> N Digit 40
=> N540
=> Digit 540
=> 9540


EXAMPLE 9.8.8: Obtain the CFG to generate the regular expression (011 + 1)*(01)*.

Solution: The given RE (011 + 1)*(01)* is of the form A*B*, where A is 011 or 1 and B is
01.

Let the CFG be G = (V, T, P, S),

where V = {S, A, B}

T = {0, 1}
P = {
S → AB
A → 011A | 1A | ε
B → 01B | ε
}

S is the start symbol.


EXAMPLE 9.8.9: Obtain the CFG for the regular expression (a + b)*aa(a + b)*.

Solution: The given RE (a + b)*aa(a + b)* is of the form X*aaX*, where X is (a + b).
Let the CFG be G = (V, T, P, S),

where V = {S, X}
T = {a, b}
P = {
S → XaaX
X → aX | bX | ε
}
S is the start symbol.

Derivation for the string abbaaba is shown below:

S => XaaX => aXaaX => abXaaX => abbXaaX => abbaaX => abbaabX => abbaabaX => abbaaba.
EXAMPLE 9.8.10: Find the CFG for the regular expression (a + b)*aa(a + b)*, given that X, Y
and S are non-terminals and S is the start symbol, with:

a. X productions, producing words ending with a.
b. Y productions, producing words starting with a.

Solution: Let the CFG be G = (V, T, P, S),

where V = {S, X, Y}

T = {a, b}
P = {
S → XY
X → aX | bX | a
Y → Ya | Yb | a
}

S is the start symbol.


EXAMPLE 9.8.11: Obtain the CFG to generate the language L = {w | n_a(w) = n_b(w)}.

Solution: The grammar must generate strings w in which the number of a's equals
the number of b's. The following cases are considered:

a. ε (denotes zero a's and zero b's)

b. a followed by the symbol b

c. b followed by the symbol a

d. a string which starts and ends with the same symbol.

Let the CFG be G = (V, T, P, S),
where V = {S}
T = {a, b}
P = {
S → ε [By a]
S → aSb [By b]
S → bSa [By c]
S → SS [By d]
}
S is the start symbol.

Derivation for the string abba, for which n_a(w) = n_b(w), is shown below:

S => SS => aSbS => abS => abbSa => abba.


EXAMPLE 9.8.12: Obtain the CFG to generate the language L = {w | n_a(w) > n_b(w)}.
Solution: The following cases are considered:

a. ε (denotes zero a's and zero b's)
b. a followed by the symbol b
c. b followed by the symbol a
d. a string that starts and ends with the same symbol
e. to produce one more a than b
f. to insert as many a's as possible.

Let the CFG be G = (V, T, P, S),
where V = {S, X, Y}
T = {a, b}
P = {
S → XY | YX | XYX [By f]
X → aXb
X → bXa
X → XX
X → ε
Y → aY | a [By e]
}
S is the start symbol.

EXAMPLE 9.8.13: Obtain the CFG to generate the language L = {w | n_a(w) = 2n_b(w)}.
Solution: The following cases are considered:

a. ε (denotes zero a's and zero b's)

b. aa followed by the symbol b

c. b followed by the symbols aa

d. a string that starts and ends with the same symbol.

Let the CFG be G = (V, T, P, S),

where V = {S}

T = {a, b}

P = {
S → ε
S → aaSb
S → bSaa
S → SS
}

S is the start symbol.


EXAMPLE 9.8.14: Obtain a CFG to generate the language L = {w | n_a(w) ≠ n_b(w)}.
Solution: Let the CFG be G = (V, T, P, S),

where V = {S, A, B}

T = {a, b}

P = {
S → A | B
A → a | Aa | aA | bAA | AbA | AAb
B → b | Bb | bB | aBB | BaB | BBa
}
S is the start symbol.


EXAMPLE 9.8.15: Obtain a grammar to generate the language L = {ww^R | w ∈ {a, b}*},
where w^R is the reverse of w.
Solution: Let the CFG be G = (V, T, P, S),

where V = {S}
T = {a, b}
P = {
S → aSa
S → bSb
S → ε
}

S is the start symbol.


EXAMPLE 9.8.16: Obtain a grammar to generate the language L = {wcw^R | w ∈ {a, b}*}.
Solution: Let the CFG be G = (V, T, P, S),

where V = {S}
T = {a, b, c}
P = {
S → aSa
S → bSb
S → c
}

S is the start symbol.


EXAMPLE 9.8.17: Obtain a CFG to generate the language L = {0^n 1^n | n ≥ 0}.
Solution: Let the CFG be G = (V, T, P, S),

where V = {S}
T = {0, 1}
P = {
S → 0S1
S → ε
}

S is the start symbol.
To derive 0^n 1^n, we apply the first production n times, followed by an application of
the second production. This gives

S => 0S1
=> 00S11
=> 0^3 S 1^3
=> ...
=> 0^n S 1^n
=> 0^n 1^n

EXAMPLE 9.8.18: Construct a CFG to generate the language L = {a^n b^(2n) | n ≥ 1}.

Solution: Since n ≥ 1, ε-productions are not considered. Let the CFG be G = (V, T, P, S),

where V = {S}
T = {a, b}
P = {
S → aSbb
S → abb
}

S is the start symbol.

To derive a^n b^(2n), we apply the first production n − 1 times, followed by an application of
the second production. This gives

S => aSbb
=> aaSbbbb
=> ...
=> a^(n−1) S b^(2(n−1))
=> a^n b^(2n).
EXAMPLE 9.8.19: Obtain a grammar to generate the language L = {0^n 1^(n+1) | n ≥ 0}.
Solution: Let the CFG be G = (V, T, P, S),
where V = {S, A}

T = {0, 1}
P = {
S → A1
A → 0A1
A → ε
}

S is the start symbol.

To derive 0^n 1^(n+1), we apply the first production once and the second production n times, followed by an application
of the third production. This gives

S => A1
=> 0A1·1
=> 00A11·1
=> 0^3 A 1^3·1
=> ...
=> 0^n A 1^n·1
=> 0^n 1^n·1
=> 0^n 1^(n+1)
EXAMPLE 9.8.20: Obtain the grammar to generate the language L = {a^m b^m c^n | m ≥
1 and n ≥ 0}.
Solution: Let the CFG be G = (V, T, P, S),
where V = {S, A}

T = {a, b, c}
P = {
S → A | Sc
A → ab | aAb
}

S is the start symbol.
The derivation for a^m b^m c^n is as follows:
S => Sc
=> Scc
=> ...
=> S c^n
=> A c^n
=> aAb c^n
=> aaAbb c^n
=> ...
=> a^(m−1) A b^(m−1) c^n
=> a^m b^m c^n.


EXAMPLE 9.8.21: Obtain a CFG to generate L = {ab(bbaa)^n bba(ba)^n | n ≥ 0}.
Solution: Let the CFG be G = (V, T, P, S),
where V = {S, A}

T = {a, b}
P = {
S → abA
A → bbaaAba | bba
}

S is the start symbol.
EXAMPLE 9.8.22: Obtain a grammar to generate the language L = {a^m b^n | m ≠ n, m ≥
0, n ≥ 0}.
Solution: Let the CFG be G = (V, T, P, S),
where V = {S, A, B}
T = {a, b}
P = {
S → aSb
S → A
S → B
A → aA | Aa | a
B → bB | Bb | b
}

S is the start symbol.


EXAMPLE 9.8.23: Obtain a grammar to generate the language L = {w : |w| mod 3 = 0}
over Σ = {a}.

Solution: We need a grammar for the language
L = {ε, aaa, aaaaaa, ...}.

In other words, the length of any string w generated should be a multiple of 3.
Let the CFG be G = (V, T, P, S),

where V = {S}
T = {a}
P = {
S → ε
S → aaaS
}

S is the start symbol.


EXAMPLE 9.8.24: Find a grammar for the language L = {a^m b^n | m, n ∈ N and n > m}.
Solution: Let the CFG be G = (V, T, P, S),
where V = {S, A}

T = {a, b}
P = {
S → aSb | aAb
A → bA | b
}

S is the start symbol.
EXAMPLE 9.8.25: Find a grammar of the language L = {a^n b c^n | n ∈ N}.
Solution: Let the CFG be G = (V, T, P, S),

where V = {S, A}

T = {a, b, c}
P = {
S → ε
S → abA
A → cS | c
}

S is the start symbol.


EXAMPLE 9.8.26: Find the grammar for the language over Σ = {a, b} in which all words are
of the form a^X b^Y a^Z. Here, X, Y, Z = 1, 2, 3, ... and Y = 5X + 7Z.

Solution: Let the CFG be G = (V, T, P, S),

where V = {S, A, B}

T = {a, b}

P = {
S → AB
A → aAb^5 | ε
B → b^7Ba | ε
}

S is the start symbol.


EXAMPLE 9.8.27: Obtain a grammar to generate the following over Σ = {a, b}.
a. Set of all strings with exactly one a.

b. Set of all strings with at least one a.

Solution:
a. G = (V, T, P, S), where V = {S, A}, T = {a, b}, S = {S} and
P = {
S → bS | aA
A → bA | ε
}
b. G = (V, T, P, S), where V = {S, A}, T = {a, b}, S = {S} and

P = {
S → bS | aA
A → aA | bA | ε
}


9.9 Derivation Tree 


When deriving a string w from S, if every derivation is considered to be a step in the tree 
construction, then the graphical display of the derivation of string w results in a tree structure. 
This is called a derivation tree or parse tree or generation tree or production tree. 

Thus a derivation tree is the display of derivations as a tree. A tree is said to be a derivation
tree if it satisfies the following requirements:

a. All leaf nodes of the tree are labelled by terminals of the grammar.
b. The root of the tree is labelled with the start symbol of the grammar.
c. The interior nodes are labelled using non-terminals.
d. If an interior node has a label A, and it has n descendants with labels x_1, x_2, ..., x_n from
left to right, then the production rule A → x_1x_2x_3...x_n must exist in the grammar.

A tree structure for the words of a language is useful in applications such as the
compilation of programming languages.


9.9.1 Formal Definition 
Let G = (V, T, P, S) be a CFG. A tree is a derivation tree for G if and only if
a. every vertex has a label, which is a terminal (from T), a non-terminal (from V) or the
null string ε,
b. the root of the tree has the start symbol of the grammar (S) as its label,
c. if a vertex is interior and has a label A, then A ∈ V,
d. if a vertex has label A and its children are labelled x_1, x_2, ..., x_n from left to right,
then the production rule A → x_1x_2x_3...x_n must exist in the grammar.


EXAMPLE 9.9.1: (Derivation tree)
a. Consider G = ({S, A}, {a, b}, P, S), where P is

S → aAS | a
A → SbA | SS | ba.

A derivation tree of G, to obtain the string w = aabbaa, is given by:

Figure 9.3. Parse tree for the string w = aabbaa.


9.9.2 Yield of a derivation tree 


The yield of a tree is the string of symbols obtained by only reading the leaves of the tree 
from left to right, without considering the €-symbols. The yield is always derived from the 
root and is always a terminal string. 


EXAMPLE: The yield of the tree in figure 9.3 is aabbaa. 
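
The definition of the yield translates directly into a short recursive function. The sketch below is illustrative only: the `Node` class and the helper names are hypothetical, and the tree is the one implied by the leftmost derivation of aabbaa in the grammar of Example 9.9.1 (S → aAS | a, A → SbA | SS | ba). The second helper checks requirement d of the definition, namely that the children of every interior node spell out the right-hand side of a production; it does not check the other requirements.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)

def tree_yield(node):
    """Read the leaves from left to right, skipping ε-leaves."""
    if not node.children:
        return "" if node.label == "ε" else node.label
    return "".join(tree_yield(child) for child in node.children)

def uses_only_productions(node, productions):
    """True if every interior node A with children x1..xn matches A -> x1...xn."""
    if not node.children:
        return True
    rhs = tuple(child.label for child in node.children)
    return (rhs in productions.get(node.label, set())
            and all(uses_only_productions(c, productions) for c in node.children))

PRODUCTIONS = {
    "S": {("a", "A", "S"), ("a",)},
    "A": {("S", "b", "A"), ("S", "S"), ("b", "a")},
}

# S -> aAS, A -> SbA, S -> a, A -> ba, S -> a
TREE = Node("S", [
    Node("a"),
    Node("A", [Node("S", [Node("a")]), Node("b"), Node("A", [Node("b"), Node("a")])]),
    Node("S", [Node("a")]),
])

print(tree_yield(TREE), uses_only_productions(TREE, PRODUCTIONS))
```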


9.9.3 Subtree of a derivation tree 


A subtree of a derivation tree is a particular vertex of the tree, together with all its descendants, 
the edges connecting them and their labels. It looks just like a derivation tree, except that 
the label of the root may not be the start symbol of the grammar. If the variable A labels the 
root, then we call the subtree an A-tree. Thus ‘S-tree’ is a synonym for ‘derivation tree’, if 
S is the start symbol.
EXAMPLE 9.9.2: (Subtree)
For the derivation tree of G in figure 9.3, one subtree is the tree rooted at the vertex
labelled A, whose children are labelled S, b and A from left to right.

Figure 9.4. Subtree of Figure 9.3
The label of the root of this subtree is A and hence it is an A-tree.


EXAMPLE 9.9.3: For the grammar G with production rules
E → E + E
E → E * E
E → id,

where V = {E} and T = {id, +, *}, obtain the derivation and the derivation tree for the string
w = id + id * id.

Solution: Derivation for w:

E => E + E
=> E + E * E
=> id + E * E
=> id + id * E
=> id + id * id.

Figure 9.5. Derivation tree for id + id * id.
EXAMPLE 9.9.4: For G = ({S, A}, {a, b}, P, S), where P is

S → aAS | a
A → SbA | SS | ba,
find the leftmost and the rightmost derivations for the string aabbaa. Also, draw the parse
tree.
Solution: Leftmost derivation of aabbaa:

S => aAS
=> aSbAS
=> aabAS
=> aabbaS
=> aabbaa

Figure 9.6 Parse tree for leftmost derivation of
the string aabbaa

Rightmost derivation of aabbaa:

S => aAS
=> aAa
=> aSbAa
=> aSbbaa
=> aabbaa

Figure 9.7 Parse tree for rightmost derivation of the
string aabbaa.


Note-3: Parse tree for the left and the rightmost derivations is same in the above example. 


EXAMPLE 9.9.5: Let G be a grammar with P given by:

S → aB | bA
A → a | aS | bAA
B → b | bS | aBB.
For the string w = aaabbabbba, find the leftmost and the rightmost derivations and also draw
the parse tree.

Solution: Leftmost derivation:

S => aB
=> aaBB
=> aaaBBB
=> aaabBB
=> aaabbSB
=> aaabbaBB
=> aaabbabB
=> aaabbabbS
=> aaabbabbbA
=> aaabbabbba

Figure 9.8 Parse tree for leftmost derivation of the string w

Rightmost derivation:

S => aB
=> aaBB
=> aaBbS
=> aaBbbA
=> aaBbba
=> aaaBBbba
=> aaaBbbba
=> aaabSbbba
=> aaabbAbbba
=> aaabbabbba

Figure 9.9 Parse tree for rightmost derivation of the string w.
EXAMPLE 9.9.6: Obtain the leftmost and the rightmost derivations and the parse tree for the
grammar whose production rules are:

E → E + E
E → E * E
E → id,

given the string for derivation as w = id + id * id.

Solution: Leftmost derivation:

E => E + E
=> id + E
=> id + E * E
=> id + id * E
=> id + id * id

Figure 9.10 Parse tree for leftmost derivation of the string w

Rightmost derivation:

E => E * E
=> E * id
=> E + E * id
=> E + id * id
=> id + id * id

Figure 9.11 Parse tree for rightmost derivation of the string w


EXAMPLE 9.9.7: Given the following grammar:
S → S[S] | ε,
construct a leftmost derivation, a rightmost derivation and a parse tree for each of the
following strings:
a. [ ]
Solution: Leftmost derivation:

S => S[S]
=> [S]
=> [ ]

Figure 9.12 Parse tree for leftmost derivation of [ ]

Rightmost derivation:

S => S[S]
=> S[ ]
=> [ ]

Figure 9.13 Parse tree for rightmost derivation of [ ]

b. [[ ]]

Solution: Leftmost derivation:

S => S[S]
=> [S]
=> [S[S]]
=> [[S]]
=> [[ ]]

Figure 9.14 Parse tree for leftmost derivation of [[ ]]

Rightmost derivation:

S => S[S]
=> S[S[S]]
=> S[S[ ]]
=> S[[ ]]
=> [[ ]]

Figure 9.15 Parse tree for rightmost derivation of [[ ]]


9.9.4 Sentential form


A string made up of terminals and non-terminals is called a sentential form (a sentence
that contains variables and terminals).


EXAMPLE 9.9.8: (Sentential form)

S => aSb => aaSbb => aaaSbbb => aaabbb

Sentential forms: aSb, aaSbb, aaaSbbb. Sentence: aaabbb.

Formal definition:

Let G = (V, T, P, S) be a CFG. Any string α ∈ (V ∪ T)* which is derivable from the
start symbol S (denoted by S =>* α) is called a sentential form of G. A sentential form
that contains only terminals is a sentence.

■ If there is a derivation of the form S =>* α, where at each step in the derivation
process only the leftmost variable is replaced, then α is called a left-sentential form.

■ If there is a derivation of the form S =>* α, where at each step in the derivation
process only the rightmost variable is replaced, then α is called a right-sentential form.

EXAMPLE 9.9.9: (Sentential form)

For the grammar G = (V, T, P, S), where P is
S → AB
A → aaA | ε
B → Bb | ε
the left and right sentential forms are:
a. Left-sentential forms:
S => AB => aaAB => aaB => aaBb => aab.
b. Right-sentential forms:
S => AB => ABb => Ab => aaAb => aab.


9.10 Ambiguous and Unambiguous Grammars


■ A CFG is ambiguous if, for at least one word in its CFL, there are two possible derivations
of the word that correspond to two different syntax or parse trees.

■ A grammar is said to be ambiguous if its language contains some string that has two
different parse trees.

■ A grammar is said to be unambiguous if every string in its language has exactly one parse tree.

■ If grammar G is ambiguous, then for some w in L(G), there exists more than one parse
tree. Hence, there is more than one leftmost derivation and, equivalently, there
is more than one rightmost derivation.

■ If grammar G is unambiguous, then for every w in L(G), there exists exactly one parse
tree. Hence, there exists exactly one leftmost derivation and, equivalently, exactly one
rightmost derivation.


Note 4: A grammar is ambiguous if there is more than one leftmost (or
rightmost) derivation for some string w.


EXAMPLE 9.10.1: (Ambiguous and unambiguous grammar)
a. Consider a grammar for arithmetic expressions:
E → a | b
E → E − E.

This grammar is ambiguous because, to derive the string w = a − b − a, there exist
two distinct parse trees, as shown in figure 9.16.

Figure 9.16. Two distinct parse trees for the derivations of string w = a − b − a.

There are two distinct parse trees, which means that there are two distinct leftmost (and
two distinct rightmost) derivations.
Leftmost derivations:
(i) E => E − E => a − E => a − E − E => a − b − E => a − b − a
(ii) E => E − E => E − E − E => a − E − E => a − b − E => a − b − a
The same is the case with rightmost derivation.
b. Consider the grammar

S → aS | a, where V = {S} and T = {a}.

This grammar is unambiguous: every string a^n in its language has exactly one parse tree.
For example, the only parse tree for the string w = aa is shown in figure 9.17.

Figure 9.17. Parse tree to derive w = aa.

There is exactly one parse tree and hence exactly one leftmost or rightmost derivation.

Leftmost derivation: S => aS => aa.
Rightmost derivation: S => aS => aa.

EXAMPLE 9.10.2: Given below is the grammar for palindromes over the alphabet {a, b}. Verify
whether it is ambiguous or unambiguous.

S → aSa | bSb | a | b | ε.
Solution: Consider w = babbab.
Derivation: S => bSb => baSab => babSbab => babbab.

Figure 9.18. Parse tree of the string w.
There is exactly one parse tree for w. The same holds for every palindrome, because the
outermost symbols of the string determine which production must be applied at each step.
Hence the grammar is unambiguous.
EXAMPLE 9.10.3: Show that the grammar
S → aS | aSb | X
X → Xa | a
where V = {S, X} and T = {a, b} is ambiguous.

Solution: The word w = aa has two different leftmost derivations that correspond to
different parse trees.

a. S => aS => aX => aa.

Figure 9.19. First Parse tree to derive w = aa.

b. S => X => Xa => aa.

Figure 9.20. Second Parse tree to derive w = aa.

The same is the case with rightmost derivation. Hence, the grammar is ambiguous.


EXAMPLE 9.10.4: Prove that S → aSbS | bSaS | ε is ambiguous.

Solution: The word w = abab has two different leftmost derivations that correspond to
different parse trees.

a. S => aSbS => abSaSbS => abaSbS => ababS => abab.
Figure 9.21. First Parse tree to derive w = abab.

b. S => aSbS => abS => abaSbS => ababS => abab.

Figure 9.22. Second Parse tree to derive w = abab.
The same is the case with rightmost derivation. Hence, the grammar is ambiguous.
EXAMPLE 9.10.5: Show that the given grammar G is ambiguous.
E → E + E
E → E * E
E → id.
Solution: The word w = id + id * id has two different leftmost derivations that correspond
to different parse trees.

a. E => E + E => id + E => id + E * E => id + id * E => id + id * id.

Figure 9.23. First Parse tree to derive w = id + id * id.
b. E => E * E => E + E * E => id + E * E => id + id * E => id + id * id.

Figure 9.24. Second Parse tree to derive w = id + id * id.

The same is the case with rightmost derivation. Hence, the grammar is ambiguous.
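
For grammars of the shape E → E op E | operand, such as those in Examples 9.10.1 and 9.10.5, the number of parse trees of a string can be counted directly. The sketch below is an illustration under that assumption only (it is not a general ambiguity test, and the function name is hypothetical): a string of tokens has one tree if it is a single operand, and otherwise its trees are obtained by choosing any operator as the root and combining the trees of the two sides.

```python
from functools import lru_cache

def count_parse_trees(tokens, operators=("+", "*"), operands=("id",)):
    """Number of parse trees of `tokens` in the grammar E -> E op E | operand."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Trees deriving tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] in operands else 0
        total = 0
        for k in range(i + 1, j - 1):          # tokens[k] is the root operator
            if tokens[k] in operators:
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens)) if tokens else 0

print(count_parse_trees(["id", "+", "id", "*", "id"]))
print(count_parse_trees(["a", "-", "b", "-", "a"], operators=("-",), operands=("a", "b")))
```

A count greater than 1 for any string shows that the grammar is ambiguous; a count of 1 for the strings tried shows nothing about the grammar as a whole.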


9.11 Total Language Tree 


Definition: For a given CFG, the total language tree is the tree
■ with root S,
■ whose children are all the productions of S,
■ whose second descendants are all the working strings that can be constructed
by applying one production to the leftmost non-terminal in each of the children,
and so on.


EXAMPLE 9.11.1: For the grammar G = (V, T, P, S), V = {S, X}, T = {a, b}

with production rules

S → aX | Xa | aXbXa
X → ba | ab,

the CFG has the total language tree shown in figure 9.25.
The language of the CFG, i.e. the CFL, is finite.

S
  aX
    aba
    aab
  Xa
    baa
    aba
  aXbXa
    ababXa
      ababbaa
      abababa
    aabbXa
      aabbbaa
      aabbaba

Figure 9.25. Finite parse tree.
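
The construction of a total language tree is a breadth-first expansion of the leftmost non-terminal. The following sketch is only an illustration: it assumes that every symbol is a single character, that the non-terminals are exactly the keys of the `productions` dictionary, and it introduces a hypothetical `max_depth` parameter so that infinite trees are cut off.

```python
from collections import deque

def total_language_tree(productions, start="S", max_depth=10):
    """Return (edges, words) of the total language tree, cut off at max_depth."""
    edges, words = [], set()
    queue = deque([(start, 0)])
    while queue:
        form, depth = queue.popleft()
        pos = next((i for i, ch in enumerate(form) if ch in productions), None)
        if pos is None:                 # only terminals left: a word of the CFL
            words.add(form)
            continue
        if depth == max_depth:          # do not expand below the cut-off
            continue
        for rhs in productions[form[pos]]:
            child = form[:pos] + rhs + form[pos + 1:]
            edges.append((form, child))
            queue.append((child, depth + 1))
    return edges, words

G1 = {"S": ["aX", "Xa", "aXbXa"], "X": ["ba", "ab"]}   # Example 9.11.1
edges, words = total_language_tree(G1)
print(sorted(words))
```

For the grammar of Example 9.11.1 the queue empties on its own, which reflects the fact that the tree is finite; for a grammar with a recursive production the result depends on the chosen cut-off.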


EXAMPLE 9.11.2: For the grammar G = (V, T, P, S), where V = {S, X}, T = {a, b} and P
is

S → aSb | aX
X → bX | a,
we can construct the total language tree as:

S
  aSb
    aaSbb
      aaaSbbb
      aaaXbb
    aaXb
      aabXb
      aaab
  aX
    abX
      abbX
      aba
    aa
  ...

Figure 9.26. Infinite parse tree.
The CFL of this CFG is infinite.


EXAMPLE 9.11.3: For the grammar G = (V, T, P, S), V = {S, X}, T = {a}, with P as
S → X | a
X → aX,

the total language tree is:

S
  X
    aX
      aaX
        ...
  a

Figure 9.27. Infinite parse tree.

The tree is infinite, but CFL = {a}.


9.12 Noam Chomsky’s Classification of Grammars 


Grammars are categorised by the productions that define them. The four types of grammars,
due to Noam Chomsky (who developed the theory of formal languages), are referred to as
'the Chomsky hierarchy of grammars'.


Let G = (V, T, P, S) be a grammar. Let A, B ∈ V and α, α′, β ∈ (V ∪ T)*. Here, α, α′
and β could be the null word.


9.12.1 Phrase-structured grammar 


■ Any phrase-structured grammar (or unrestricted grammar) is of type 0. G is said to
be of type 0 if all the productions are of the form α → β, where α ∈ (V ∪ T)+ and
β ∈ (V ∪ T)*.

■ Clearly, α cannot be ε, which means that ε cannot be the left side of
any production. However, ε can appear on the right-hand side of any production.

■ A language L(G) is recursively enumerable (or type 0) if the grammar G is of type 0.

■ The language recogniser in this case is the Turing Machine.
EXAMPLE 9.12.1: (Type 0 grammar)

S → aBb | ε
aB → bBB
bB → ab


9.12.2 Context-sensitive grammar 


■ G is context-sensitive (or type 1) if every production is of the form α → β, where
α, β ∈ (V ∪ T)+ and |α| ≤ |β|.

■ Clearly, α and β cannot be ε, which means that ε appears on neither the left-
nor the right-hand side of any production. Thus, this grammar is ε-free.

■ A language L(G) is context-sensitive (or type 1) if the grammar G is context-
sensitive.

■ The language recogniser in this case is the linear-bounded automaton.


EXAMPLE 9.12.2: (Type-1 grammar)


S → aBb
aB → bBB
bB → ab


9.12.3 Context-free grammar 


■ G is context-free (or type 2) if every production is of the form A → α, where
α ∈ (V ∪ T)* and A is a non-terminal.

■ In a context-free grammar, the LHS of every production is a single non-terminal symbol
A, which α can replace.

■ Clearly, ε can appear only on the right-hand side of a production.
■ A language L(G) is context-free if the grammar G is context-free (or type 2).
■ The language recogniser in this case is the Pushdown Automaton.
EXAMPLE 9.12.3: (Type-2 grammar)
S → aSa
S → bSb
S → a | b | ε


9.12.4 Regular grammar 


■ G is a regular grammar (or type 3) iff the grammar is right- or left-linear. In other
words, every production is of the form A → t or A → tB, where t ∈ T. That is, the
L.H.S. of every production consists of a single non-terminal symbol A and the R.H.S.
consists of a terminal symbol t or a terminal symbol t followed by a non-terminal
symbol B.

■ A language L(G) is regular if the grammar G is regular (or type 3).

■ The language recogniser in this case is the finite automaton.


EXAMPLE 9.12.4: (Type-3 grammar)


a. S → abS
S → a

b. S → Aab
A → Aab | B
B → a


A regular grammar is also context-free and a context-free grammar is also context-
sensitive. The Venn diagram in figure 9.28 shows the Chomsky hierarchy of the
various grammars.


Figure 9.28. Chomsky hierarchy of Grammars
The following examples clarify the above definitions:
EXAMPLE 9.12.5: Let G = (V, T, P, S), where V = {A, B, S}, T = {a, b} and P = {S →
aA, A → bA, A → a}.
Here, every production is of the form A → t or A → tB.


Hence, G is a regular grammar and the production A → bA is recursive. Consequently,
L(G) = {ab^n a | n ≥ 0} is a regular language.


EXAMPLE 9.12.6: Let G = (V, T, P, S), where V = {A, S}, T = {a, b}, and P = {S →
aS, S → Aa, A → b}.


Here, the R.H.S. of the production S → Aa contains the terminal symbol a on the right
of the non-terminal symbol A. Hence, G is not regular.


However, every production is of the form w → α, where w ∈ V and α ∈ (V ∪ T)*. So,
G is context-free and L(G) is a context-free language.


We further discuss the Chomsky hierarchy of grammars in detail in chapter 15.
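
The checks made by hand in Examples 9.12.5 and 9.12.6 can be written as a small classifier. The sketch below is illustrative and makes several assumptions that are not in the text: each symbol is one character, a production is a pair (lhs, rhs) of strings with "" standing for ε, the regular case follows the form A → t or A → tB used above, and the type-1 test uses the usual non-contracting condition 1 ≤ |α| ≤ |β|. The function name is hypothetical.

```python
def grammar_type(productions, nonterminals):
    """Return the most restrictive type (3, 2, 1 or 0) satisfied by all productions."""
    nonterminals = set(nonterminals)

    def context_free(lhs, rhs):          # A -> α with a single non-terminal A
        return len(lhs) == 1 and lhs in nonterminals

    def regular(lhs, rhs):               # A -> t  or  A -> tB
        if not context_free(lhs, rhs):
            return False
        if len(rhs) == 1:
            return rhs not in nonterminals
        return (len(rhs) == 2
                and rhs[0] not in nonterminals
                and rhs[1] in nonterminals)

    def context_sensitive(lhs, rhs):     # non-contracting, no ε on either side
        return 1 <= len(lhs) <= len(rhs)

    for level, test in ((3, regular), (2, context_free), (1, context_sensitive)):
        if all(test(lhs, rhs) for lhs, rhs in productions):
            return level
    return 0

EX_9_12_5 = [("S", "aA"), ("A", "bA"), ("A", "a")]
EX_9_12_6 = [("S", "aS"), ("S", "Aa"), ("A", "b")]
print(grammar_type(EX_9_12_5, "SAB"), grammar_type(EX_9_12_6, "SA"))
```

The classifier looks only at the shape of the productions; a language may still be of a more restrictive type than the particular grammar that happens to generate it.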


Avram Noam Chomsky (born 1928), a linguist, writer and political activist, was born
in Philadelphia, the son of a Hebrew scholar. At 10, he proofread the manuscript
of his father's edition of a thirteenth-century Hebrew grammar.


On graduating from Central High School in Philadelphia in 1945, Chomsky entered
the University of Pennsylvania and received his B.A. in 1949 and M.A. two years later.


Chomsky received his Ph.D. in linguistics from the University of Pennsylvania in
1955 and joined the faculty at the Massachusetts Institute of Technology.


His first book, Syntactic Structures (1957), developed from his notes for an
introductory course in linguistics, triggered the Chomskyan revolution in linguistics
by disputing traditional ideas about language development. Chomsky is considered
the father of the theory of formal languages.


In 1966, Chomsky became the Ferrari P. Ward Professor of Modern Languages
and Linguistics. He has been a visiting professor at Columbia, Princeton and the
University of California at Los Angeles and at Berkeley.


A recipient of numerous awards and honorary degrees, including the Kyoto Prize
in Basic Sciences in 1988, Chomsky was named one of the thousand 'makers of the
twentieth century' by the London Times.


9.13 Relation between Regular Grammar and Finite Automata


Since the language accepted by a finite automaton is a regular language, the following 
relations exist between regular grammars and finite automata. 


a. Finite automaton from regular grammar 
For a given right-linear grammar G, there exists a language L(G), which is accepted 
by a finite automaton. 

b. Regular grammar from finite automaton 
For a given finite automaton M, there exists a right-linear grammar G such that 
L(M) = L(G). 

c. Regular expression from regular grammar 
For a given regular grammar G, there exists a regular expression that specifies L(G). 


9.13.1 Finite automaton from regular grammar 


Let G = (V, T, P, S) be a right-linear grammar. Let V = {V0, V1, ...} be the variables and 
let the productions P (of G) be of the form: 


Vi → a1a2...amVj 
or 
Vi → a1a2...am. 


We construct a finite automaton M = (Q, Σ, δ, q0, F) from these productions by using the 
following steps: 


Step-1: The start symbol of the grammar is the start state of M (q0 = V0). 
Step-2: Each variable Vi of G corresponds to a state in M. 


Step-3: For each production of the form Vi → a1a2...amVj, add to M the transitions 
and intermediate states needed for M to move from Vi to Vj on reading a1a2...am. 

Step-4: For each production of the form Vi → a1a2...am, add to M the transitions 
and intermediate states needed for M to move from Vi to Vf on reading a1a2...am 
(where Vf is the final state of M). 


Note-5: For any production of the form Vi → ε, δ(Vi, ε) = Vi and Vi is also a 
final state of M. 
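

The construction above can be sketched in Python. This is only a minimal illustration under stated assumptions: productions are written as hypothetical (head, terminals, tail) triples, the intermediate states are named X1, X2, ... and the added final state is named Vf; the helper names grammar_to_nfa and accepts are not from the text.

```python
from itertools import count

EPS = ""        # label used for epsilon-moves (an assumption of this sketch)
FINAL = "Vf"    # name assumed for the single added final state


def grammar_to_nfa(productions, start):
    """Build an NFA from a right-linear grammar.

    productions: list of (head, terminals, tail) triples, where tail is a
    variable name or None; ("A", "aa", "B") stands for A -> aaB.
    """
    delta = {}                              # (state, symbol) -> set of states
    fresh = (f"X{i}" for i in count(1))     # names for intermediate states
    finals = {FINAL}

    def add(src, sym, dst):
        delta.setdefault((src, sym), set()).add(dst)

    for head, terminals, tail in productions:
        target = tail if tail is not None else FINAL
        if terminals == "":
            if tail is None:                # V -> epsilon: V is final (Note-5)
                finals.add(head)
            else:                           # V -> W: epsilon-move
                add(head, EPS, tail)
            continue
        src = head
        for sym in terminals[:-1]:          # one intermediate state per extra symbol
            nxt = next(fresh)
            add(src, sym, nxt)
            src = nxt
        add(src, terminals[-1], target)
    return delta, start, finals


def accepts(nfa, word):
    """Simulate the NFA (with epsilon-moves) on word."""
    delta, start, finals = nfa

    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for r in delta.get((q, EPS), ()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    current = closure({start})
    for ch in word:
        moved = set()
        for q in current:
            moved |= delta.get((q, ch), set())
        current = closure(moved)
    return bool(current & finals)


# Grammar S -> aA | B, A -> aaB, B -> bB | a
P = [("S", "a", "A"), ("S", "", "B"), ("A", "aa", "B"),
     ("B", "b", "B"), ("B", "a", None)]
nfa = grammar_to_nfa(P, "S")
```

With this grammar, accepts(nfa, "aaabba") and accepts(nfa, "bba") hold, while accepts(nfa, "aa") does not, which agrees with the language aaab*a + b*a.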


EXAMPLE 9.13.1: Construct finite automaton to accept the language generated by the 
following grammar: 


S → aA | B 
A → aaB 
B → bB | a 


Solution: Let G be the right-linear grammar and M the finite automaton. 


Step-1: The start symbol of G is the start state of M, i.e., q0 = S. 


Step-2: Every variable in V = {S, A, B} of G corresponds to a state in M. 


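Step-3: Add the transitions of M for all productions of the form Vi → a1a2...amVj. 


a. S → aA: add an a-transition from S to A. 
b. S → B: add an ε-transition from S to B. 
c. A → aaB: add transitions and an intermediate state X, i.e., an a-transition 
from A to X and an a-transition from X to B. 
d. B → bB: add a b-transition from B to B. 


Step-4: Add the transitions for all productions of the form Vi → a1a2...am. 
B → a: add an a-transition from B to the final state Vf. 
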


Figure 9.29. Final M with L(M) = L(G) = aaab*a + b*a. 


Thus, M = (Q, Σ, δ, q0, F), 




where 


Q = {S, A, X, B, Vf} 
Σ = {a, b} 
q0 = S 
F = {Vf}. 


EXAMPLE 9.13.2: Construct a FA to accept the language generated by the following 
grammar: 


S → aA | ε 
A → aA | bB | ε 
B → bB | ε 


Solution: Let G be the right-linear grammar and M the finite automaton. 


Step-1: The start symbol of G is the start state of M, q0 = S. 
Step-2: Every variable in V = {S, A, B} of G corresponds to a state in M. 


Step-3: Add the transitions for all productions of the form Vi → a1a2...amVj. 


a. S → aA: add an a-transition from S to A. 
b. A → aA: add an a-transition from A to A. 
c. A → bB: add a b-transition from A to B. 
d. B → bB: add a b-transition from B to B. 

There are productions of the form Vi → ε, i.e., S → ε, A → ε and B → ε. 
Therefore S, A and B are all final states. The final M is shown in figure 9.30. 


Figure 9.30. Final M with L(M) = L(G). 
Thus, M = (Q, Σ, δ, q0, F), where Q = {S, A, B}, Σ = {a, b}, q0 = S and F = {S, A, B}. 

EXAMPLE 9.13.3: Construct the automaton for the grammar 
Vo > a1Vi|a3V2 
Vi > aza4V3\a3a4ag V4 
V2 > a5V4 


V3 —> agVi\as5 


V4 —> ao 


Solution: Let G be the right-linear grammar and M the finite automaton. 


Step-1: The start symbol of G is the start state of M, q0 = V0. 
Step-2: Every variable in V = {V0, V1, V2, V3, V4} of G corresponds to a state in M. 




Step-3: Compute for all productions of the form V; > a a2... amV;. 
a. V0 → a1V1 




b. V0 → a3V2 


c. Vi > aragV3 




d. Vi — az3aqagV4 


e. V2 → a5V4 




f. V3 > agV\ 




Step-4: Compute for all productions of the form V; > a1a2... a, as computing V3 > as 
and V4 — apo gives us the final automata shown in figure 9.31. 




Figure 9.31. Final M with L(M) = L(G). 
Thus, M = (Q, Σ, δ, q0, F), where Q = {V0, V1, V2, V3, V4, X1, X2, X3, Vf}, 
Σ = {a0, a1, a2, a3, a4, a5, a8, a9}, q0 = V0 and F = {Vf}. 


EXAMPLE 9.13.4: Construct automaton to recognise the language generated by the 
grammar: 


S → 0S | 1A | 1 
A → 0A | 0S | 0 | 1. 


Solution: Let G be the right-linear grammar and M the finite automaton. 


Step-1: The start symbol of G is the start state of M, q0 = S. 
Step-2: Every variable in V = {S, A} of G corresponds to a state in M. 


Step-3: Add the transitions for all productions of the form Vi → a1a2...amVj. 


a. S → 0S: add a 0-transition from S to S. 
b. S → 1A: add a 1-transition from S to A. 
c. A → 0A: add a 0-transition from A to A. 
d. A → 0S: add a 0-transition from A to S. 


Step-4: Add the transitions for all productions of the form Vi → a1a2...am. The productions 
S → 1, A → 0 and A → 1 add a 1-transition from S to Vf, a 0-transition from A to Vf and 
a 1-transition from A to Vf, which gives the final automaton shown in figure 9.32. 


Figure 9.32. Final M with L(M) = L(G). 


Thus, M = (Q, Σ, δ, q0, F), where Q = {S, A, Vf}, Σ = {0, 1}, q0 = S and 
F = {Vf}. 


9.13.2 Regular grammar from finite automaton 


Let M = (Q, Σ, δ, q0, F) be a finite automaton, where Q = {q0, q1, ..., qn} is the finite set 
of states and Σ = {a1, a2, ..., am} is the set of input symbols. Construct a regular grammar 
G = (V, T, P, S), where V = {q0, ..., qn}, T = Σ and S = q0, by using the following steps: 


Step-1: For any transition of the form δ(qi, a) = qj, the corresponding production is qi → aqj. 


Step-2: For any final state qf of M, the production added is qf → ε. 
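

The two steps can be sketched in Python as follows. The representation is an assumption of this sketch: the transition function is a dictionary from (state, symbol) to a set of next states, the empty string stands for an ε-move, and the small two-state automaton used at the end is hypothetical, not one of the figures in the text.

```python
def fa_to_grammar(delta, start, finals):
    """Return (start_symbol, productions) of a right-linear grammar.

    delta maps (state, symbol) to a set of next states; symbol "" is an
    epsilon-move. Each production is (head, body) with body a tuple.
    """
    productions = []
    for (state, symbol), targets in sorted(delta.items()):
        for target in sorted(targets):
            if symbol:                                   # q_i -> a q_j
                productions.append((state, (symbol, target)))
            else:                                        # q_i -> q_j
                productions.append((state, (target,)))
    for state in sorted(finals):                         # q_f -> epsilon
        productions.append((state, ()))
    return start, productions


# Hypothetical automaton: q0 --a--> q1, q1 --b--> q1, with q1 final.
delta = {("q0", "a"): {"q1"}, ("q1", "b"): {"q1"}}
start_symbol, rules = fa_to_grammar(delta, "q0", {"q1"})
```

For this automaton the sketch yields the productions q0 → aq1, q1 → bq1 and q1 → ε, i.e. a grammar for ab*.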


EXAMPLE 9.13.5: Construct a regular grammar from the following finite automaton. 




Figure 9.33. Transition diagram 

Solution: For each transition of the FA, the corresponding productions are shown below: 


Transition               Production 
q0 → q1 (on a)           q0 → aq1 
q1 → q1 (on b)           q1 → bq1 
q1 → q2 (on a)           q1 → aq2 
q2 → q3 (on b)           q2 → bq3 
q3 → q1 (on ε)           q3 → q1 
q3 is a final state      q3 → ε 


Thus, G = (V, T, P, S) is the required grammar, where V = {q0, q1, q2, q3}, T = {a, b}, 
S = q0, F = {q3} and 


P = { 
q0 → aq1 
q1 → bq1 
q1 → aq2 
q2 → bq3 
q3 → q1 
q3 → ε 
}. 


EXAMPLE 9.13.6: Obtain the regular grammar for the FA shown in figure 9.34. 


Figure 9.34. Transition diagram. 

Solution: For each transition of the FA, the corresponding productions are shown below: 


Transition               Production 
q0 → q1 (on 0)           q0 → 0q1 
q1 → q0 (on 1)           q1 → 1q0 
q0 → q2 (on 1)           q0 → 1q2 
q2 → q0 (on 0)           q2 → 0q0 
q2 → q3 (on 1)           q2 → 1q3 
q1 → q3 (on 0)           q1 → 0q3 
q3 → q3 (on 0)           q3 → 0q3 
q3 → q3 (on 1)           q3 → 1q3 
q0 is a final state      q0 → ε 


Thus, G = (V, T, P, S), where V = {q0, q1, q2, q3}, T = {0, 1}, S = q0, F = {q0} and P is 
given above. 


EXAMPLE 9.13.7: Obtain the regular grammar from the FA given in figure 9.35. 


——-(4,) -2.@ =. 
a © 


Figure 9.35. Transition diagram. 


Solution: For each transition of the FA, the corresponding productions are shown below: 


Transition               Production 
q0 → q0 (on 0)           q0 → 0q0 
q0 → q1 (on ε)           q0 → q1 
q1 → q1 (on 1)           q1 → 1q1 
q1 → q2 (on ε)           q1 → q2 
q2 → q2 (on 0)           q2 → 0q2 
q2 is a final state      q2 → ε 


Thus, G = (V, T, P, S) is the required grammar, where V = {q0, q1, q2}, T = {0, 1}, S = q0, 
F = {q2} and 


P = { 
q0 → 0q0 
q0 → q1 
q1 → 1q1 
q1 → q2 
q2 → 0q2 
q2 → ε 
}. 


9.13.3 Regular Expression from Regular Grammar 


Let G = (V, T, P, S) be a right-linear grammar. A regular expression R that specifies L(G) 
can be directly obtained as follows. 


a. Replace the '→' symbols in the grammar's productions with '=' symbols to get a set 
of equations. 

b. Convert the equations of the form A = aA | b to A = a*b. 

c. Solve the set of equations obtained, to obtain the value of the variable S (where S is the 
start symbol of the grammar). The result is the regular expression specifying L(G). 
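

A result obtained this way can be sanity-checked mechanically. The Python sketch below does not solve the equations; it only enumerates the strings a right-linear grammar generates up to a length bound and compares them with the strings matched by a candidate regular expression. The grammar encoding and the helper name generated are assumptions of this sketch, and it assumes every production with a trailing variable contributes at least one terminal (otherwise the loop need not terminate).

```python
import re
from itertools import product


def generated(productions, start, max_len):
    """All terminal strings of length <= max_len derivable from start.

    productions: {variable: [(terminals, next_variable_or_None), ...]}
    """
    results, frontier = set(), {("", start)}
    while frontier:
        new = set()
        for prefix, var in frontier:
            for terminals, nxt in productions[var]:
                word = prefix + terminals
                if len(word) > max_len:
                    continue
                if nxt is None:
                    results.add(word)         # derivation finished
                else:
                    new.add((word, nxt))      # derivation continues from nxt
        frontier = new
    return results


# Grammar S -> 01B | 0, B -> 1B | 11 and the candidate expression 011*11 + 0
G = {"S": [("01", "B"), ("0", None)], "B": [("1", "B"), ("11", None)]}
candidate = re.compile(r"011*11|0")
N = 8
from_grammar = generated(G, "S", N)
from_regex = {"".join(w)
              for n in range(N + 1)
              for w in product("01", repeat=n)
              if candidate.fullmatch("".join(w))}
```

Agreement of from_grammar and from_regex up to the bound N is evidence, not a proof, that the expression specifies L(G).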


EXAMPLE 9.13.8: Obtain the regular expression, for the grammar given below: 


S → 01B | 0 
B → 1B | 11. 
Solution: 
Step-1: Replace → by = in the above productions, so that, 
S = 01B | 0 ...... (1) 
B = 1B | 11 ...... (2) 


Step-2: B = 1B | 11 ⇒ B = 1*11 [∵ A = aA | b is A = a*b] 
Step-3: Substituting for B in eq. (1), we get 
S = 01B | 0 
⇒ S = 011*11 | 0 
or S = (011*11 + 0). 

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EXAMPLE 9.13.9: Obtain the regular expression for the grammar given below. 


S → baS | aA 
A → bbA | bb. 
Solution: 
Step-1: Replace → by = in the above productions, so that, 
S = baS | aA ...... (1) 
A = bbA | bb ...... (2) 
Step-2: A = bbA | bb ⇒ A = (bb)*bb [∵ A = aA | b is A = a*b] 
Step-3: Substituting for A in eq. (1) yields 
S = baS | aA 
S = baS | (a(bb)*bb) 
⇒ S = (ba)*(a(bb)*bb). 


EXAMPLE 9.13.10: Obtain the regular expression for the grammar given below. 


S → 0A | 0 | 1B 
A → 1A | 1 
B → 0B | 1S 


Solution: 


Step-1: Replace → by = in the above productions, so that, 


S = 0A | 0 | 1B ...... (1) 
A = 1A | 1 ...... (2) 
B = 0B | 1S ...... (3) 


Step-2: 


A = 1A | 1 ⇒ A = 1*1 
B = 0B | 1S ⇒ B = 0*1S 

Step-3: Substituting for A and B in eq. (1), we get 


S = 0A | 0 | 1B 

S = 01*1 | 0 | 10*1S 

S = (10*1)S | (01*1 | 0) 

S = (10*1)*(01*1 | 0) (∵ A = aA | b is A = a*b) 
or S = (10*1)* · (01*1 + 0). 


9.14 Exercises 


1. Compare regular languages and context-free languages. 
2. Define grammar. Explain the structure for writing a grammar. 
3. Define production rule for grammar. Explain its different forms. 
4. What is a context-free grammar? Explain with examples. 
5. Explain different types of grammar with examples. 
6. Draw a CFG on {a, b}, to generate a language L = {a^n w w^R b^n | w ∈ Σ*, n > 1}. 
7. Obtain a CFG to generate L = {w | w ∈ {a, b}*, n_a(w) > n_b(w)}. 
8. Obtain a grammar to generate the language L = {w : |w| mod 3 > 0} on Σ = {a}. 
9. Obtain a grammar to generate the set of all strings with no more than three a's on 
Σ = {a, b}. 
10. Obtain a grammar to generate the language L = {a^i b^j c^k | i + 2j = k, i > 0, j > 0}. 
11. Present the formal definition of derivation tree for G. 
12. Explain with examples, the leftmost and rightmost derivation. 
13. For the grammar G = (V, T, P, S), where V = {S}, T = {a, b} and 
P = { 
S → aSa | bSb | ε 
} 
construct a leftmost derivation, a rightmost derivation and a parse tree for each of the following strings: 
a. aaaaaa 
b. abbbba 
c. bababbabab 

14. What is meant by ambiguous and unambiguous grammars? 


15. Show that the following grammars are ambiguous: 


a. S → aS | X 
X → aX | a 

b. S → aSbS 
S → bSaS | ε 


16. What is a regular grammar? Explain the relation between regular grammar and finite 
automaton. 


17. Construct finite automaton to accept the language generated by the following 
grammars: 


a. S → aS | X 
X → aX | a 

b. S → AB | aaB 
A → a | Aa 
B → b 

c. S → SS | dS | Sd | c 


18. Construct a regular grammar from each of the following finite automata: 

a. Figure 9.36. Transition diagram 

b. Figure 9.37. Transition diagram 


19. Obtain regular expressions from the following grammars. 


a. S → AB | aaB 
A → aA | a 
B → b 

b. S → XaaX 
X → aX | bX | ε 


20. Write a procedure to transform a left-linear grammar to right-linear grammar. 


Simplification of Context-Free Grammar 


‘The supreme excellence is simplicity.’ 


Introduction 


In this chapter, the important and unimportant symbols in a CFG are discussed. Methods to 
eliminate the unwanted productions are also illustrated. This procedure leads to the reduction 
of a CFG into a more compact form. The salient features of the three normal forms, due to 
Chomsky, Greibach and Kuroda, are discussed and well-examined problems are included, 
to make the concepts clear. 


10.1 Simplification of Context-Free Grammars 


In a CFG, it is not necessary to use all the symbols in V or all the productions in P, for 
deriving sentences. So, we can reduce the complexity of the grammar, without reducing the 
generating power of CFG. This can be done by the following procedures: 


■ By eliminating the useless symbols. 
■ By eliminating the unit productions. 
■ By eliminating the null productions, if ε is not included in the language. 


10.1.1 Useful and useless symbols in CFG 
A grammar symbol X in a CFG is useful if and only if: 


a. it derives a string of terminals, and 
b. it is used in the derivation of at least one w in L(CFG). 


Formally, in a CFG G = (V, T, P, S), X is useful if and only if: 


a. X ⇒* w, where w is in T*, and 
b. S ⇒* αXβ ⇒* w, in L(G). 

A grammar symbol X in a CFG is useless if: 


a. it does not derive a string of terminals, or 
b. it does not occur in a derivation sequence of any w in L(CFG). 


Formally, in a CFG G = (V, T, P, S), X is useless if it fails to satisfy either of the 
following conditions: 


a. X ⇒* w, where w is in T*, or 
b. S ⇒* αXβ ⇒* w, in L(G). 


Thus, to find the useless and useful symbols in a grammar, we need to: 


a. identify the non-terminals which derive terminal strings, and 
b. identify the variables which are reachable from the start symbol. 


EXAMPLE 10.1.1: Identify the useful and useless symbols in G = (V, T, P, S), given: 
V = {S, A, B}, T = {a} and P is 


S → a | AB and A → a. 


Solution: 


■ To find the variables which derive a terminal string: S and A derive terminal strings, 
i.e., S → a and A → a. 
B does not derive a terminal string. 


■ To find the variables which are reachable in a derivation from the start symbol: once the 
production S → AB is discarded (because B is useless), A is not reachable from the start symbol. 


Thus, S is useful and A, B are useless symbols in G. 


⇒ G = ({S}, {a}, {S → a}, S) 




EXAMPLE 10.1.2: Identify the useful and useless symbols in G = (V, T, P, S), given: 
V = {S, A, B, C}, T = {a, b} and P is 
S → aS | A | C 
A → a 
B → aa 
C → aCb. 


Solution: 


■ To find the variables which derive a terminal string: S, A and B derive 
terminal strings, 
i.e., S → aS | A, A → a, B → aa. 


■ To find the variables which are reachable in a derivation from the start symbol: B is not 
reachable from the start symbol. 
Thus, S, A are useful and B, C are useless symbols in G 


⇒ G = ({S, A}, {a}, {S → aS | A, A → a}, S). 


EXAMPLE 10.1.3: Identify the useful and useless symbols in G = (V, T, P, S) given: 
V = {S, A, B, C}, T = {a, b} and P is 
S → AB | CA 
B → BC | AB 
A → a 
C → aB | b. 


Solution: 


■ To find the variables which derive a terminal string: S, A and C derive 
terminal strings, 
i.e., S → CA, A → a, C → b. 


■ To find the variables which are reachable from the start symbol: S, C and A are all reachable 
from the start symbol. 

Thus, S, C, A are useful and B is a useless symbol in G. 
⇒ G = ({S, C, A}, {a, b}, {S → CA, A → a, C → b}, S) 


10.2 Reduction of a Grammar 


Reduction of a grammar refers to the identification of (i) those grammar symbols, which 
are useless and (ii) those productions, that do not play any role in the derivation of any w in 
L(G). 

Thus, the reduction of a given grammar G, involves: 


a. identification of those grammar symbols, that are not capable of deriving a w in T* 
b. identification of those grammar symbols that are not used in any derivation 
c. elimination of the above identified symbols. 


10.2.1 Reduction Algorithm 


Given a grammar G = (V, T, P, S), obtaining the reduced form of G, i.e., G' = (V', T, P', S), 
involves the following steps: 


Step-1: Set V' = ∅. 

Step-2: Find the variables which derive a terminal string, and add them to V'. 

Step-3: Repeat the following procedure until no more variables are added 
to V'. For every A ∈ V for which P has a production of the form 
A → x1x2...xn with all xi in (V' ∪ T), add A to V'. Then keep in V' only 
the variables that are reachable from the start symbol through productions 
all of whose symbols are in (V' ∪ T). 

Step-4: Take P' as all productions in P, all of whose symbols are in (V' ∪ T). 
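

The reduction can be sketched in Python in two passes: first collect the variables that derive a terminal string, then keep only those reachable from the start symbol. The encoding (productions as (head, body) pairs with body a tuple of symbols) and the function name reduce_grammar are assumptions of this sketch.

```python
def reduce_grammar(variables, productions, start):
    """Remove useless symbols; returns the list of surviving productions.

    variables: set of variable names; every other symbol is a terminal.
    productions: list of (head, body) with body a tuple of symbols.
    """
    # Pass 1: variables that derive some terminal string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if head not in generating and all(
                    s in generating or s not in variables for s in body):
                generating.add(head)
                changed = True
    kept = [(h, b) for h, b in productions
            if h in generating
            and all(s in generating or s not in variables for s in b)]

    # Pass 2: variables reachable from the start symbol via kept productions.
    reachable, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for h, b in kept:
            if h == v:
                for s in b:
                    if s in variables and s not in reachable:
                        reachable.add(s)
                        stack.append(s)
    return [(h, b) for h, b in kept if h in reachable]


# S -> AB | CA, B -> BC | AB, A -> a, C -> aB | b
V = {"S", "A", "B", "C"}
P = [("S", ("A", "B")), ("S", ("C", "A")),
     ("B", ("B", "C")), ("B", ("A", "B")),
     ("A", ("a",)), ("C", ("a", "B")), ("C", ("b",))]
reduced = reduce_grammar(V, P, "S")
```

For this grammar the sketch keeps S → CA, A → a and C → b. The order of the two passes matters: reachability is computed only after the productions that mention a non-generating variable have been discarded.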


EXAMPLE 10.2.1: Find the reduced grammar equivalent to the CFG, where V = 
{A, B, C, S}, T = {a, b}, P is 


S → AB | CA 
B → BC | AB 
A → a 
C → aB | b. 

Solution: 


■ Set V' = ∅. 

■ Find the variables which derive terminal strings: S, A and C derive terminal 
strings ⇒ V' = {S, A, C}. 

■ Find the variables which are reachable from the start symbol: S, C and A are all 
reachable from the start symbol ⇒ V' = {S, A, C}. 

■ P' = {S → CA, A → a, C → b}. 
Thus, G' = ({S, A, C}, T, {S → CA, A → a, C → b}, S). 

EXAMPLE 10.2.2: Find the reduced grammar equivalent to the CFG 
G = ({S, A, B, C}, {a, b, d}, S, P) 
where P is 
S → AC | SB 
A → bASC | a 
B → aSB | bBC 
C → ad. 

Solution: 
■ Set V' = ∅. 

■ Find the variables which derive terminal strings: S, A and C derive terminal strings 
⇒ V' = {S, A, C}. 

■ Find the variables which are reachable from the start symbol: S, A and C are all 
reachable from the start symbol ⇒ V' = {S, A, C}. 

■ P' = {S → AC, A → bASC | a, C → ad}. 
Thus, G' = ({S, A, C}, T, {S → AC, A → bASC | a, C → ad}, S). 

EXAMPLE 10.2.3: Find the reduced grammar equivalent to the CFG with V = {S, B, A, X}, 
T = {a, b, q, d} and P containing 

S → aB | bX 
A → BAd | bSX | q 
B → aSB | bBX 
X → SBD | aBX | ad. 

Solution: 


■ Set V' = ∅. 

■ Find the variables which derive terminal strings: A → q, X → ad, S → bX ⇒ V' = 
{S, A, X}. 

■ Find the variables which are reachable from the start symbol: S and X are reachable from 
the start symbol ⇒ V' = {S, X}. 

■ P' = {S → bX, X → ad}. 


Thus, G' = ({S, X}, T, {S → bX, X → ad}, S). 


EXAMPLE 10.2.4: Find the reduced grammar for the given grammar G, where P is 
S → aAa 
A → bBB 
B → ab 
C → ab. 

Solution: 


■ Set V' = ∅. 
■ Find the variables which derive a terminal string: 
B → ab, C → ab ⇒ V' = {B, C}. 
■ Find the variables which are reachable from the start symbol: 
S ⇒ aAa ⇒ abBBa ⇒* abababa, 
i.e., S, A, B are reachable from the start symbol 
⇒ V' = {S, A, B}. 
■ P' = {S → aAa, A → bBB, B → ab}. 
Thus, G' = ({S, A, B}, T, {S → aAa, A → bBB, B → ab}, S). 


EXAMPLE 10.2.5: Find the reduced grammar equivalent to the given grammar, whose P is 
A → xyz | Xyzz 
X → Xz | xYx 
Y → yYy | XZ 
Z → Zy | z. 

Solution: 


■ Set V' = ∅. 
■ Find the variables which derive terminal strings: 
A → xyz, Z → z ⇒ V' = {A, Z}. 

■ Find the variables reachable from the start symbol: Z is not reachable from the start symbol 
⇒ V' = {A}. 
■ P' = {A → xyz}. 


Thus, G' = ({A}, T, {A → xyz}, A). 


EXAMPLE 10.2.6: Find the useless symbols in the following grammar and modify the 
grammar, so that it has no useless symbols: 


S → 0 | A 
A → AB 
B → 1 
Solution: 
■ Set V' = ∅. 

■ Find the variables which derive terminal strings: 
S → 0, B → 1 ⇒ V' = {S, B}. 

■ Find the variables which are reachable from the start symbol: B is not reachable from the 
start symbol ⇒ V' = {S}. 

■ P' = {S → 0}. 
Thus, G' = ({S}, T, {S → 0}, S). 


10.3 Elimination of ε-Productions 


A production of the form A → ε is called an ε-production. If A is a non-terminal and 
A ⇒* ε (i.e., if A leads to the empty string in one or more derivation steps), then A is called 
a 'nullable non-terminal'. 

EXAMPLE 10.3.1: For the grammar: 
S → aS1b 
S1 → aS1b | ε, 


the ε-production is: S1 → ε. 


EXAMPLE 10.3.2: For the grammar: 
S → ACB | cbB | Ba 
A → da | BC 
B → gC | ε 
C → ha | ε 


It is clear that B → ε and C → ε are ε-productions. 
Consider, 
A ⇒ BC 
⇒ B 
⇒ ε 


i.e., A ⇒* ε. Thus, A is nullable. 
Again, 
S ⇒ ACB 
⇒ AC 
⇒ A 
⇒* ε 


i.e., S ⇒* ε, hence S is also nullable. 
∴ B, C, A and S are all nullable. 


Theorem I: If G = (V, T, P, S) is a CFG, then there exists a CFG G', having no 
null productions, such that L(G') = L(G) − {ε}. 


Proof: This theorem can be proved (in a constructive way) as follows: 


Step-1: Construction of the set of nullable variables. 

The nullable variables can be computed recursively, as follows: 


a. W1 = {A ∈ V | A → ε is in P}. 

b. Wi+1 = Wi ∪ {A | A → α with α ∈ Wi*}. 


By the definition of Wi, 
Wi ⊆ Wi+1 for all i. 


As V is finite, 

Wk+1 = Wk for some k ≤ |V|. 


So Wk+j = Wk for all j. Let W = Wk. 


Therefore, W is the set of all nullable variables. 
Step-2: Construction of P' 


a. Any production whose R.H.S. does not have any nullable variables is 
included in P'. 

b. If A → X1X2...Xk is in P, then the productions of the form 
A → α1α2...αk are included in P', where αi = Xi if Xi ∉ W (i.e., 
if Xi is not nullable), αi = Xi or ε if Xi ∈ W, and α1α2...αk ≠ ε. 


Since the null productions are eliminated in this way, it is clear that G' does not contain any 
null productions. Hence proved. 


10.3.1 Method to eliminate ε-productions 


Step-1: Identify the null productions and nullable variables. Compute the 
nullable variables by using the recursive definition: 


a. W1 = {A ∈ V | A → ε is in P}. 
b. Wi+1 = Wi ∪ {A | A → α, with α ∈ Wi*}. 


Step-2: Construction of P' 


a. Any production whose R.H.S. does not have any nullable variables is 
included in P'. 

b. If there is a production, say A → ε, then to eliminate it, look 
for all those productions in the grammar whose right sides contain A 
and, for each of them, add the productions obtained by deleting the 
occurrences of A in every possible combination. Thus, obtain 
the non-ε-productions to be added to the grammar, so that the language 
generated remains the same. 
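

A minimal Python sketch of this method is given below. The encoding (productions as (head, body) pairs, body a tuple of symbols, the empty tuple standing for ε) and the function names are assumptions of this sketch. Note that, applied literally, the method may leave unit productions such as S → A behind.

```python
from itertools import product


def nullable_variables(productions):
    """Fixed-point computation of the set W of nullable variables."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            # all() is True for the empty body, so A -> epsilon seeds the set
            if head not in nullable and all(s in nullable for s in body):
                nullable.add(head)
                changed = True
    return nullable


def eliminate_epsilon(productions):
    """Return productions without epsilon-productions (language minus epsilon)."""
    nullable = nullable_variables(productions)
    result = []
    for head, body in productions:
        # a nullable symbol may be kept or dropped; any other symbol is kept
        choices = [((s,), ()) if s in nullable else ((s,),) for s in body]
        for pick in product(*choices):
            new_body = tuple(sym for part in pick for sym in part)
            if new_body and (head, new_body) not in result:
                result.append((head, new_body))
    return result


# S -> AaA, A -> Sb | bCC | epsilon, C -> CC | abb
P = [("S", ("A", "a", "A")),
     ("A", ("S", "b")), ("A", ("b", "C", "C")), ("A", ()),
     ("C", ("C", "C")), ("C", ("a", "b", "b"))]
W = nullable_variables(P)
P_new = eliminate_epsilon(P)
```

For this grammar W = {A}, and the S-productions become S → AaA | Aa | aA | a, while the A- and C-productions lose only A → ε.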


EXAMPLE 10.3.3: Construct a grammar G', without null productions, for the given grammar 
G, whose productions are: 
S → aS | AB 
A → ε 
B → ε 
D → b. 
Solution: 
Step-1: Construction of nullable set 
W1 = {A | A → ε is in P} 
W1 = {A, B} (∵ A → ε, B → ε) 
Wi+1 = Wi ∪ {A | A → α with α ∈ Wi*} 
W2 = {A, B} ∪ {S} (∵ S → AB ⇒* ε) 
W2 = {S, A, B} 
W3 = {S, A, B} ∪ ∅ ⇒ W3 = {S, A, B}. 
Thus, W = {S, A, B} is the set of all nullable variables. 


Step-2: Construction of P' 


a. The production whose R.H.S. has no nullable variables is D → b. 


b. S → aS generates S → aS and S → a (∵ S ⇒* ε). 
S → AB generates S → AB, S → A and S → B (∵ A → ε, B → ε). 


Thus, P' = {S → aS | a | AB | A | B, D → b}. Hence, G' is without ε-productions. 

EXAMPLE 10.3.4: Find a CFG, without ε-productions, equivalent to the following grammar 
defined by: 
S → ABaC, A → BC, B → b | ε, C → D | ε and D → d. 

Solution: 

Step-1: Construction of nullable set 
W1 = {A | A → ε is in P} 
W1 = {B, C} (∵ B → ε, C → ε). 

Wi+1 = Wi ∪ {A | A → α with α ∈ Wi*}. 
W2 = {B, C} ∪ {A} (∵ A → BC ⇒* ε) 
W2 = {A, B, C}. 
Similarly, W3 = {A, B, C} ∪ ∅ ⇒ W3 = {A, B, C}. 


Thus, W = {A, B, C}. 
Step-2: Construction of P'. 


a. The productions whose R.H.S. has no nullable variables are B → b, C → D, D → d. 
b. S → ABaC generates 
S → ABaC, S → ABa, S → AaC, S → BaC, 
S → aC, S → Ba, S → Aa, S → a. 


A → BC generates A → BC, A → B, A → C. 
Thus, 
P' = {S → ABaC | ABa | AaC | BaC | aC | Ba | Aa | a, 
A → BC | B | C, 
B → b, 
C → D, 
D → d}. 
Thus, G' is without ε-productions. 
EXAMPLE 10.3.5: Consider the following grammar and eliminate all ε-productions, 
without changing the language generated by the grammar. 
S → AaA 
A → Sb | bCC | ε 
C → CC | abb 

Solution: 
Step-1: Construction of nullable set. 
W1 = {A | A → ε is in P}. 
W1 = {A} (∵ A → ε). 
Wi+1 = Wi ∪ {A | A → α with α ∈ Wi*}. 
W2 = {A} ∪ ∅ ⇒ W2 = {A}. 


Thus, W = {A}. 
Step-2: Construction of P'. 


a. The productions whose R.H.S. has no nullable variables are C → CC | abb. 
b. S → AaA generates 
S → AaA, S → Aa, S → aA, S → a. 
Thus, 
P' = {S → AaA | Aa | aA | a, 
A → Sb | bCC, 
C → CC | abb}. 


Thus, G' is without ε-productions. 


10.4 Elimination of Unit Productions 


The following are equivalent definitions of a unit production: 
■ Any production of the form A → B, whose R.H.S. consists of a single variable, is called 
a unit production. All other productions, including A → a and A → ε, are non-unit productions. 


■ A production of the form A → B, where A and B are both non-terminals, is called a 
unit production. 


Note-1: 


The presence of unit productions in a grammar increases the cost of 
derivations. 

EXAMPLE 10.4.1: Consider a grammar 
S → AB 
A → a 
B → C | b 
C → D 
D → E 
E → a. 
Here, 


■ unit productions: B → C, C → D, D → E, 
■ non-unit productions: S → AB, A → a, B → b, E → a. 


10.4.1 Method to eliminate unit productions 


Step-1: Identify the unit productions in the grammar. 
Step-2: Construction of P' 
a. Identify all non-unit productions and add them to P'. 


b. If there exists a unit production A → B in the grammar, then apply the 
following steps until there are no unit productions left. 


■ Select a unit production A → B, such that there exists at least one 
non-unit production B → α. 


■ Now, for every non-unit production B → α, add the production A → α 
to the grammar and eliminate A → B from the grammar. 
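

One way to sketch this in Python is shown below. It does not perform the substitutions one at a time; instead it uses the closely related unit-closure formulation (for each variable A, collect every variable B reachable from A through unit productions, then copy the non-unit productions of each such B to A). The encoding and the name eliminate_unit are assumptions of this sketch.

```python
def eliminate_unit(variables, productions):
    """Return an equivalent list of productions with no unit productions.

    productions: list of (head, body) with body a tuple of symbols.
    """
    def is_unit(body):
        return len(body) == 1 and body[0] in variables

    # unit[A] = variables B such that A derives B using unit productions only
    unit = {v: {v} for v in variables}
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if is_unit(body):
                for a in variables:
                    if head in unit[a] and body[0] not in unit[a]:
                        unit[a].add(body[0])
                        changed = True

    result = []
    for a in sorted(variables):
        for head, body in productions:
            # copy every non-unit production of a unit-reachable variable to a
            if head in unit[a] and not is_unit(body) and (a, body) not in result:
                result.append((a, body))
    return result


# S -> Aa | B, B -> A | bb, A -> a | bc | B
V = {"S", "A", "B"}
P = [("S", ("A", "a")), ("S", ("B",)),
     ("B", ("A",)), ("B", ("b", "b")),
     ("A", ("a",)), ("A", ("b", "c")), ("A", ("B",))]
P_new = eliminate_unit(V, P)
```

For this grammar the result is S → Aa | a | bc | bb, B → a | bc | bb and A → a | bc | bb. Working with the closure avoids looping forever on cyclic unit productions such as B → A and A → B.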


EXAMPLE 10.4.2: Eliminate all unit productions from the grammar given below: 
S → Aa | B 
B → A | bb 
A → a | bc | B 

Solution: 


Step-1: Identify all unit productions 
a. S → B. 
b. B → A. 
c. A → B. 

Step-2: Construction of P' 


a. To eliminate S → B 
Substitute the alternatives of B in S → B, i.e., 
S → Aa | B ⇒ S → Aa | A | bb. 

But S → A is a unit production. 
Substitute the alternatives of A in S → Aa | A | bb: 
S → Aa | A | bb ⇒ S → Aa | a | bc | bb. 


b. To eliminate B → A 
Substitute the alternatives of A in B → A, so that 
B → A | bb ⇒ B → a | bc | bb. 


c. To eliminate A → B 
Substitute the alternatives of B in A → B, so that 
A → a | bc | B ⇒ A → a | bc | bb. 

Thus, P' is 
S → Aa | a | bc | bb 
B → a | bc | bb 
A → a | bc | bb. 


EXAMPLE 10.4.3: Given the grammar below, eliminate all the unit productions. 

S → AB 
A → a 
B → C | b 
C → D 
D → E 
E → a 

Solution: 


Step-1: Identify all unit productions 
a. B → C 
b. C → D 
c. D → E 


Step-2: Construction of P' 
P' = {S → AB, A → a, E → a} 


a. To eliminate B → C 
Substitute the alternatives of C in B → C, so that 
B → C | b ⇒ B → D | b. 
Since B → D is a unit production, substitute the alternatives of D in B → D | b: 
B → D | b ⇒ B → E | b. 
Since B → E is a unit production, substitute the alternatives of E in B → E | b: 
B → E | b ⇒ B → a | b. 

b. To eliminate C → D 
Substitute the alternatives of D in C → D, so that 
C → D ⇒ C → E. 
Since C → E is a unit production, substitute the alternatives of E in C → E: 
C → E ⇒ C → a. 

c. To eliminate D → E 
Substitute the alternatives of E in D → E, so that 
D → E ⇒ D → a. 


Thus, the modified P: 
S → AB, A → a, B → a | b, 
C → a, D → a and E → a. 

EXAMPLE 10.4.4: Eliminate all unit productions from the following grammar: 
S → a | aA | B | C 
A → aB | ε 
B → Aa 
C → cCD 
D → ddd. 


Solution: The given grammar has ε-productions. After the elimination of ε-productions from the 
above grammar, P' is: 
S → a | aA | B | C | a 
A → aB 
B → Aa | a 
C → cCD 
D → ddd. 
Step-1: Identify all unit productions 
a. S → B. 
b. S → C. 


Step-2: Construct P'' 
P'' = {A → aB, B → Aa | a, C → cCD, D → ddd} 


a. To eliminate S → B 
Substitute the alternatives of B in S → B, so that 
S → a | aA | B | C | a ⇒ S → a | aA | Aa | a | C | a. 


b. To eliminate S → C 
Substitute the alternatives of C in S → C, so that 
S → a | aA | Aa | a | C | a ⇒ S → a | aA | Aa | a | cCD | a. 

Finally, P'' is: 
S → a | aA | Aa | a | cCD | a 
A → aB 
B → Aa | a 
C → cCD 
D → ddd. 

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10.5 Normal Forms for Context-Free Grammars 


When the productions in CFG are made to satisfy certain restrictions, then CFG is said to 
be in normal form. 


The following are some types of normal forms. 


a. Chomsky normal form 
b. Greibach normal form 
c. Kuroda normal form 


10.6 Chomsky Normal Form 


Chomsky normal form (CNF) puts restrictions on the number of symbols on the right of a 
production. In other words, the string on the right of a production consists of not more than 
two symbols. 


Formally, a CFG is said to be in Chomsky normal form if all productions are of 
the form 


A → BC 


or A → a, 


where A, B, C are non-terminals and a is a terminal. 


EXAMPLE 10.6.1: The grammar 


S → BS | a 
B → SB | b 
is in CNF. 
EXAMPLE 10.6.2: The grammar 
S → BS | BSB 
B → SB | aa 
is not in CNF, because both the productions S → BSB and B → aa violate the conditions 
of CNF. 

EXAMPLE 10.6.3: The grammar 


S → AB | CA 
B → BC | AB 
A → a 
C → aB | b 


is not in CNF, because C → aB violates the conditions. 

10.6.1 Reduction of CFG to CNF 


Theorem II: For every CFL without ε, generated by a grammar G1, there is an 
equivalent grammar G2 in CNF. 

Proof: Let G1 = (V, T, P, S) be the CFG generating a language not containing ε. 
We construct G2 = (V'', T, P'', S) as follows: 


Step-1: Eliminate ε-productions and unit productions 


Eliminate all ε-productions and unit productions from G1 by using an appropriate 
method. If a production contains a single terminal symbol on the right, it is in 
acceptable form. Add all such productions to P' and all their variables to V'. 


Step-2: Eliminate terminals on R.H.S. 


All productions in P of the form A → BC or A → a are included in 
P' and all variables of such productions are included in V'. Consider a 
production of the form A → x1x2...xn, n ≥ 2. If xi is a terminal 'a', 
introduce a new variable 'T_a' with the production T_a → a, and replace the 
terminal xi by T_a. These new productions are added to P' and the new 
variables are added to V'. 


Step-3: Restrict the number of variables on R.H.S. 


All productions of P' that are already in acceptable form are added to P'' 
and all respective variables are included in V''. Consider a production 
A → A1A2...An, n ≥ 3. Introduce the new productions A → A1K1, 
K1 → A2K2, ..., Kn-2 → An-1An into P'', and add the respective new variables 
K1, K2, ..., Kn-2 to V''. 


Thus, we get G2 in CNF. 
Hence proved. 
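

Steps 2 and 3 of the construction can be sketched in Python as follows. The sketch assumes Step-1 has already been done (no ε-productions or unit productions remain) and that the generated names T_a and K1, K2, ... do not clash with existing variables; the encoding and the name to_cnf are assumptions of this sketch.

```python
def to_cnf(variables, productions):
    """Convert a CFG without epsilon- and unit productions to CNF.

    productions: list of (head, body) with body a non-empty tuple of symbols.
    """
    result = []
    term_var = {}          # terminal a -> name of its new variable T_a
    counter = 0            # numbering for the new variables K1, K2, ...

    def add(head, body):
        if (head, body) not in result:
            result.append((head, body))

    for head, body in productions:
        if len(body) == 1:                 # A -> a is already acceptable
            add(head, body)
            continue
        # Step 2: replace every terminal a on a long right side by T_a
        symbols = []
        for s in body:
            if s in variables:
                symbols.append(s)
            else:
                term_var.setdefault(s, "T_" + s)
                symbols.append(term_var[s])
        # Step 3: break right sides longer than two into a chain
        while len(symbols) > 2:
            counter += 1
            k = f"K{counter}"
            add(head, (symbols[0], k))
            head, symbols = k, symbols[1:]
        add(head, tuple(symbols))

    for a, t in term_var.items():          # the productions T_a -> a
        add(t, (a,))
    return result


# S -> aAD, A -> aB | bAB, B -> b, D -> d
V = {"S", "A", "B", "D"}
P = [("S", ("a", "A", "D")), ("A", ("a", "B")), ("A", ("b", "A", "B")),
     ("B", ("b",)), ("D", ("d",))]
cnf = to_cnf(V, P)
```

For this grammar the sketch produces S → T_a K1, K1 → AD, A → T_a B | T_b K2, K2 → AB, B → b, D → d, T_a → a and T_b → b.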

EXAMPLE 10.6.4: Reduce the grammar into CNF. 


S → aAD 
A → aB | bAB 
B → b 
D → d. 


Solution: 


Step-1: Eliminate null and unit productions. There are no such productions. 
Step-2: P' = {B → b, D → d} is in CNF. 
Reduce the number of terminals in 
S → aAD 
A → aB | bAB. 

Thus, 
S → aAD ⇒ S → T_a AD, where T_a → a 
A → aB ⇒ A → T_a B, where T_a → a 
A → bAB ⇒ A → T_b AB, where T_b → b. 


Step-3: Restrict the number of variables on the R.H.S. (for those productions which are not 
in CNF). 
Thus, 
S → T_a AD ⇒ S → T_a K1, where K1 → AD 
A → T_a B ⇒ A → T_a B, already in CNF 
A → T_b AB ⇒ A → T_b K2, where K2 → AB. 
Thus, 
P'' = {S → T_a K1, 
A → T_a B, 
A → T_b K2, K1 → AD, K2 → AB, 
B → b, 
D → d, T_a → a, T_b → b}. 
V'' = {S, A, B, D, K1, K2, T_a, T_b}. 

EXAMPLE 10.6.5: For the grammar given below, find an equivalent grammar in CNF. 


S → bA | aB 
A → bAA | aS | a 
B → aBB | bS | b 
Solution: 
Step-1: There are no null and unit productions. 
Step-2: P' = {A → a, B → b} is in CNF. 
Reduce the number of terminals in 
S → bA | aB 
A → bAA | aS 
B → aBB | bS. 

Thus, 
S → bA ⇒ S → T_b A, where T_b → b 
S → aB ⇒ S → T_a B, where T_a → a 
A → bAA ⇒ A → T_b AA, where T_b → b 
A → aS ⇒ A → T_a S, where T_a → a 
B → aBB ⇒ B → T_a BB, where T_a → a 
B → bS ⇒ B → T_b S, where T_b → b. 
Step-3: P' = {A → a, B → b, S → T_b A, S → T_a B, A → T_a S, B → T_b S} are now in 
CNF. 
Restrict the number of variables for the productions 
A → T_b AA and B → T_a BB, 
so that 
A → T_b AA ⇒ A → T_b K1, where K1 → AA 
B → T_a BB ⇒ B → T_a K2, where K2 → BB. 

Thus, 
V'' = {S, A, B, T_a, T_b, K1, K2}. 


P'' = S → T_b A | T_a B 
A → T_b K1 | T_a S | a 
B → T_a K2 | T_b S | b 
K1 → AA 
K2 → BB, T_a → a and T_b → b. 


EXAMPLE 10.6.6: Convert the given grammar to CNF.
S → AB | CA
B → BC | AB
A → a
C → aB | b

Solution:

Step-1: Eliminate ε and unit productions. There are no such productions.
Step-2: P’ = {S → AB | CA, B → BC | AB, C → b, A → a} is in CNF. Reduce the
number of terminals in C → aB.
C → aB ⇒ C → T_a B, where T_a → a.
Thus,
P” = {S → AB | CA,
B → BC | AB,
C → T_a B | b,
A → a,
T_a → a}.
V” = {S, B, C, A, T_a}.
EXAMPLE 10.6.7: Write the equivalent CNF for the grammar:
S → AB
A → aab
B → aAC.

Solution:
Step-1: There are no unit and null productions.

Step-2: P’ = {S → AB} is in CNF.
Reduce the number of terminals for A → aab and B → aAC.

A → aab ⇒ A → T_a T_a T_b, where T_a → a and T_b → b
B → aAC ⇒ B → T_a AC.

Step-3: Restrict the number of variables on the R.H.S. for
A → T_a T_a T_b ⇒ A → T_a K_1, where K_1 → T_a T_b
B → T_a AC ⇒ B → T_a K_2, where K_2 → AC.

Thus,

P” = {S → AB,
A → T_a K_1,
B → T_a K_2,
K_1 → T_a T_b,
K_2 → AC,
T_a → a,
T_b → b}.

V” = {S, A, B, C, K_1, K_2, T_a, T_b}.


EXAMPLE 10.6.8: Reduce the grammar into CNF.

S → A0B
A → AA | 0S | 0
B → 0BB | 1S | 1

Solution:
Step-1: There are no unit and null productions.
Step-2: P’ = {A → AA, A → 0, B → 1} is in CNF.
Reduce the number of terminals for

S → A0B
A → 0S
B → 0BB | 1S

S → A0B ⇒ S → A T_0 B, where T_0 → 0
A → 0S ⇒ A → T_0 S, where T_0 → 0
B → 0BB ⇒ B → T_0 BB, where T_0 → 0
B → 1S ⇒ B → T_1 S, where T_1 → 1.

Step-3: P’ = {A → AA, A → 0, A → T_0 S, B → T_1 S, B → 1} is in CNF.
Restrict the number of variables on the R.H.S. for S → A T_0 B and B → T_0 BB.

S → A T_0 B ⇒ S → A K_1, where K_1 → T_0 B
B → T_0 BB ⇒ B → T_0 K_2, where K_2 → BB.

Thus,

P” = {S → A K_1,
A → AA | T_0 S | 0,
B → T_0 K_2 | T_1 S | 1,
K_1 → T_0 B,
K_2 → BB,
T_0 → 0,
T_1 → 1}.

V” = {S, A, B, K_1, K_2, T_0, T_1}.


10.7 Greibach Normal Form 


Greibach Normal Form (GNF) puts restrictions not on the length of the R.H.S. of a production,
but on the positions in which terminals and variables appear. The arguments justifying GNF are
a little complicated and not very transparent.
Formally, a CFG is said to be in Greibach Normal Form if all productions are of
the form

A → ax,

where a is a terminal, i.e., a ∈ T, and x is a (possibly empty) string of variables, i.e.,
x ∈ V*.


EXAMPLE 10.7.1: The grammar

S → AB
A → aA | bB | b
B → b

is not in GNF because S → AB is not of the form A → ax.

EXAMPLE 10.7.2: The grammar

S → aAB | bBB | bB
A → aA | bB | b
B → b

is in GNF.

EXAMPLE 10.7.3: The grammar

S → aSB | aB
B → b

is in GNF.
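
The definition can be checked mechanically. The sketch below assumes a simple encoding that is not part of the text: a grammar is a dictionary from a variable (an uppercase letter) to a list of right-hand sides, each written as a string in which lowercase letters are terminals.

```python
def is_gnf(grammar):
    """True if every production has the form A -> a x, with a a terminal
    (lowercase) and x a possibly empty string of variables (uppercase)."""
    for bodies in grammar.values():
        for body in bodies:
            if not body or not body[0].islower():
                return False
            if not all(sym.isupper() for sym in body[1:]):
                return False
    return True


example_1 = {"S": ["AB"], "A": ["aA", "bB", "b"], "B": ["b"]}
example_2 = {"S": ["aAB", "bBB", "bB"], "A": ["aA", "bB", "b"], "B": ["b"]}
example_3 = {"S": ["aSB", "aB"], "B": ["b"]}

print(is_gnf(example_1), is_gnf(example_2), is_gnf(example_3))
```

The three dictionaries encode Examples 10.7.1 to 10.7.3; only the first one fails the test, because of S → AB.
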


Sheila Greibach was born in New York City in 1939. She attended Radcliffe 
College, where she received her BA in 1960. In 1963, she received her Ph.D. from 
Harvard University. 

Greibach is well known for her work on formal languages. One of her
contributions is Greibach normal form, a normal form for grammars in which every
production is of the form A → aα, where a is a terminal and α is a (possibly empty)
string of variables.
10.7.1 Reduction of CFG to Greibach Normal Form


Theorem III: For every CFG G with ε ∉ L(G), there exists an equivalent grammar
G’ in Greibach Normal Form.
Proof and algorithm:
Outline of the proof:
Firstly, we convert the given grammar into Chomsky NF:
Start with a grammar G = (V, T, P, S).
Eliminate useless variables that cannot derive a terminal string.
Eliminate useless variables that cannot be reached.
Eliminate ε-productions.
Eliminate unit productions.
Convert the grammar to Chomsky Normal Form.


Then, we convert this grammar into an equivalent grammar in Greibach NF. The 
core of this procedure to construct a grammar in GNF is the following: 

We sort our productions so that we get a sequential line of dependencies for those
variables which come first on the R.H.S. of the productions. Every derivation
will use variables in accordance with this sequential ordering. The last variable
in the sequence has only rules that begin with a terminal. Thus, every
derivation will end (at the latest) with this variable. We use this variable to start a step-by-
step substitution of the first R.H.S. variable in the other rules, so that we successively
get all rules starting with a terminal.

This way, we transform the grammar, starting with a CNF, into GNF. 

Steps to convert a CNF grammar into Greibach Normal Form:

a. Relabel all variables such that the names are A_1, A_2, ..., A_m.

b. We want to order the productions which are not terminal but contain
variables. For this purpose, the indexing of the variables is used, so that

A_i → A_j α with i < j, for all i = 1, ..., m and j = 2, ..., m.

We perform the ordering process by substituting the first variable on
the R.H.S. if the production violates the condition above (see c and d).
During this process we also eliminate left-recursive rules, i.e., rules of the
form

A_i → A_i α

as soon as we encounter them in the ordering process (see e).
The outcome of this phase is that all rules are sorted and have either a
higher-numbered variable or a terminal as the first symbol on the R.H.S.

c. We start the ordering with A_1.
A_1-productions can have only a higher-numbered variable as first variable
on the R.H.S., or a single terminal. (For left-recursive rules A_1 → A_1 ..., see e.)

d. We now assume that all rules are okay up to A_{k-1}. The next rule we
encounter, with A_k on the L.H.S., is the first one which is not okay:

A_k → A_l α with k > l.

We resolve this problem by substituting A_l. Since l < k, the A_l-rules have
already gone through the sorting process and are in the proper format:

A_l → A_j α with l < j.

Now we substitute the R.H.S.s of A_l in the A_k-rule, and come up with:

A_k → A_j α or A_k → a α for some α.

If j is still less than k, we substitute again. This process is repeated until we
get at least A_k on the R.H.S. All rules up to A_{k-1} are already sorted and in
proper form, and the first variable on the R.H.S. of an A_{k-1}-production
must, therefore, be at least A_k.

e. When we encounter a left-recursive rule during this process, we resolve this
left-recursion immediately.
Let us assume that we have a set of left-recursive rules for A_k:

1. A_k → A_k α_1 | A_k α_2 | ... | A_k α_r

and a set of non-left-recursive rules:

2. A_k → β_1 | β_2 | ... | β_n.

We convert the left-recursion over A_k into a right-recursion, using a new
variable B_k and new productions:

3. B_k → α_1 | α_1 B_k | α_2 | α_2 B_k | ... | α_r | α_r B_k.

We integrate this with the non-left-recursive rules for A_k:

4. A_k → β_1 | β_1 B_k | β_2 | β_2 B_k | ... | β_n | β_n B_k.

The rules 3 and 4 replace the original rules 1 and 2.

When you look at rule 4, you can see that it generates the original R.H.S.s
β_i from the non-left-recursive rules for A_k. It also includes the recursion
through the variable B_k, but starting ‘at the end’, i.e., using the string β_i first,
and then the recursive variable B_k. B_k can now be substituted with just one
of the α_i, or the recursion can generate more of those α_i. Thus, the new
productions 3 and 4 generate the same strings as the old productions
1 and 2.

f. When we are done with this, we find that all the rules are in the proper
format (according to b). They start either with a higher-numbered variable
or with a terminal (from the original rules, directly or through substitution).
In particular, the rules for the highest-numbered variable have to start
with a terminal (since no variable has a higher index than this one).

g. Now we use this ordering and again perform a set of substitutions, to ensure
that all productions start with a single terminal.
We start these substitutions from ‘the end’. We know that the rules for
A_m must start with a single terminal. All other rules start either with a
terminal or with a higher-numbered variable, followed by the
other variables.
We start the substitutions with the rules for the second-highest variable
A_{m-1}. If the R.H.S. of an A_{m-1}-production starts with a variable, it must be
A_m. We substitute this A_m with its R.H.S.s and get new rules for A_{m-1}, which
start with a single terminal. We proceed backwards doing this, through all
A_i down to A_1. Now all A_i-rules start with a single terminal symbol, followed
by nothing or by variables only.

h. In the last step, we bring the B-rules (introduced for the removal of
left-recursion) into proper format. If the R.H.S. of a B-rule starts with a variable,
we just need to replace this variable with its R.H.S.s (which start with a
terminal). Then the B_k-rules also conform to GNF.

i. The conversion of the CNF grammar into GNF is complete.
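
The steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: it assumes the input is already in CNF (no ε-productions, no unit productions), stores each right-hand side as a tuple of symbols, and names the new variables of step e as "B_" plus the variable name. The function names and the dictionary layout are choices made for this sketch.

```python
def substitute_leading(bodies, rules, should_expand):
    """Replace the leading variable of a body by each of its right-hand
    sides whenever should_expand(leading variable) is true."""
    result = []
    for body in bodies:
        if body[0] in rules and should_expand(body[0]):
            result += [head + body[1:] for head in rules[body[0]]]
        else:
            result.append(body)
    return result


def to_gnf(grammar, order):
    """grammar: variable -> list of bodies; order: [A_1, ..., A_m]."""
    rules = {v: [tuple(b) for b in grammar[v]] for v in order}
    index = {v: i for i, v in enumerate(order)}
    b_rules = {}
    for k, a_k in enumerate(order):
        # Steps c and d: expand leading variables with a smaller index.
        while any(b[0] in index and index[b[0]] < k for b in rules[a_k]):
            rules[a_k] = substitute_leading(
                rules[a_k], rules, lambda v: index[v] < k)
        # Step e: turn immediate left recursion into right recursion.
        alphas = [b[1:] for b in rules[a_k] if b[0] == a_k]
        if alphas:
            betas = [b for b in rules[a_k] if b[0] != a_k]
            b_k = "B_" + a_k
            rules[a_k] = betas + [b + (b_k,) for b in betas]
            b_rules[b_k] = alphas + [a + (b_k,) for a in alphas]
    # Step g: back-substitution from A_m down to A_1.
    for a_k in reversed(order):
        rules[a_k] = substitute_leading(rules[a_k], rules, lambda v: True)
    # Step h: bring the B-rules into the proper format.
    for b_k, bodies in b_rules.items():
        rules[b_k] = substitute_leading(bodies, rules, lambda v: True)
    return rules


# CNF grammar S -> AB, A -> BD | a, B -> b, D -> SB, ordered S, A, B, D.
cnf = {"S": [("A", "B")], "A": [("B", "D"), ("a",)],
       "B": [("b",)], "D": [("S", "B")]}
gnf = to_gnf(cnf, ["S", "A", "B", "D"])
for var, bodies in gnf.items():
    print(var, "->", " | ".join(" ".join(b) for b in bodies))
```

The sample grammar needs no left-recursion step; a grammar such as A_1 → A_2 A_2 | 0, A_2 → A_1 A_1 | 1 exercises step e as well.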
EXAMPLE 10.7.4: Bring the grammar G with V = {S, A, B}, T = {a, b} and productions P

S → A
A → aBa | a
B → bAb | b

into GNF.
Solution:

Step-1: Simplify G

There are no useless variables or productions and no ε-productions. Remove the unit
production S → A. Replace A with the R.H.S.s of A (after calculating the transitive closure
of unit productions; there is only one unit dependency here, i.e., S ⇒ A).

S → aBa | a (new rule)

Step-2: Transform G into an equivalent grammar G’ in Chomsky NF
a. Substitute terminals on the R.H.S. with variables:
S → C_1 B C_1 | a    C_1 → a
A → C_1 B C_1 | a    C_2 → b
B → C_2 A C_2 | b.

b. Break down the rules:

S → C_1 D_1 | a    D_1 → B C_1    C_1 → a
A → C_1 D_1 | a
B → C_2 D_2 | b    D_2 → A C_2    C_2 → b.

Step-3: Transform G’ into an equivalent grammar G” in Greibach NF
Rename the variables to V_1, V_2, ... in the productions P’.

S = V_1; A = V_2; B = V_3; C_1 = V_4; C_2 = V_5; D_1 = V_6; D_2 = V_7

V_1 → V_4 V_6 | a    V_6 → V_3 V_4    V_4 → a
V_2 → V_4 V_6 | a
V_3 → V_5 V_7 | b    V_7 → V_2 V_5    V_5 → b
Step-4: Order the productions and remove recursion where necessary
V_1 → V_4 V_6 | a
V_2 → V_4 V_6 | a
V_3 → V_5 V_7 | b
V_4 → a
V_5 → b
V_6 → V_3 V_4
V_7 → V_2 V_5
The rules for V_1, V_2, V_3, V_4, V_5 are fine with respect to ordering.
a. Modify the V_6-rule V_6 → V_3 V_4
■ Substitute the R.H.S.s of V_3 in the V_6-rule:
V_6 → V_5 V_7 V_4 | b V_4
■ Substitute the R.H.S. of V_5 in the modified V_6-rule:
V_6 → b V_7 V_4 | b V_4    (final V_6-rule)
b. Modify the V_7-rule V_7 → V_2 V_5
■ Substitute the R.H.S.s of V_2 in the V_7-rule:
V_7 → V_4 V_6 V_5 | a V_5
■ Substitute the R.H.S. of V_4 in the modified V_7-rule:
V_7 → a V_6 V_5 | a V_5    (final V_7-rule)


Now all the rules are sorted properly, according to the ordering constraint: if
V_i → V_j ..., then i < j.


The new productions are:

V_1 → V_4 V_6 | a
V_2 → V_4 V_6 | a
V_3 → V_5 V_7 | b
V_4 → a
V_5 → b
V_6 → b V_7 V_4 | b V_4
V_7 → a V_6 V_5 | a V_5
Step-5: Substitute to achieve Greibach Normal Form
We have to substitute backwards all leading variables on the R.H.S. (i.e., in all
those rules which still start with a variable).
These are the V_3-, V_2- and V_1-rules:
V_3 → V_5 V_7 | b    out!
V_3 → b V_7 | b    new rule
V_2 → V_4 V_6 | a    out!
V_2 → a V_6 | a    new rule
V_1 → V_4 V_6 | a    out!
V_1 → a V_6 | a    new rule
V_4 → a
V_5 → b
V_6 → b V_7 V_4 | b V_4
V_7 → a V_6 V_5 | a V_5


The grammar is now in GNF:

V_1 → a V_6 | a
V_2 → a V_6 | a
V_3 → b V_7 | b
V_4 → a
V_5 → b
V_6 → b V_7 V_4 | b V_4
V_7 → a V_6 V_5 | a V_5.


EXAMPLE 10.7.5: Bring the grammar G with V = {S, A, B}, T = {a, b} and productions P

S → AB
A → BSB
A → a
B → b
into Greibach NF.
Solution:

Step-1: Simplify G
There are no useless variables or productions, no ε-productions and no unit
productions. Nothing to do here.

Step-2: Transform G into an equivalent grammar G’ in Chomsky NF
S → AB
A → BSB
A → a
B → b

a. There is no terminal on the R.H.S. together with other variables or terminals.
Nothing to do here.

b. Break down the rules.
Modify the A-rule as:

A → BSB    (introduce a variable D_1 and “break down” the rule)
A → B D_1    (2 new rules)
D_1 → SB
c. Grammar G’ is in Chomsky NF:
S → AB
A → B D_1
D_1 → SB
A → a
B → b
Step-3: Transform G’ into an equivalent grammar G” in Greibach NF
Rename the variables in the productions P’ to V_1, V_2, ...:
S = V_1; A = V_2; B = V_3; D_1 = V_4
V_1 → V_2 V_3
V_2 → V_3 V_4 | a
V_3 → b
V_4 → V_1 V_3
Step-4: Order the productions and remove recursion where necessary
V_1 → V_2 V_3    okay.
V_2 → V_3 V_4 | a    okay.
V_3 → b    okay.
V_4 → V_1 V_3    needs to be modified.

a. Modify the V_4-rule
■ Substitute the R.H.S. of V_1 in the V_4-rule:
V_4 → V_2 V_3 V_3
■ Substitute the R.H.S.s of V_2 in the modified V_4-rule:
V_4 → V_3 V_4 V_3 V_3 | a V_3 V_3
■ Substitute the R.H.S. of V_3 in the modified V_4-rule above:
V_4 → b V_4 V_3 V_3 | a V_3 V_3    (new V_4-rule)
All the rules are now ordered properly:
V_1 → V_2 V_3
V_2 → V_3 V_4 | a
V_3 → b
V_4 → b V_4 V_3 V_3 | a V_3 V_3
Step-5: Substitute backwards to achieve Greibach Normal Form
V_4 → b V_4 V_3 V_3 | a V_3 V_3    okay.
V_3 → b    okay.
V_2 → V_3 V_4 | a    needs to be modified, substitute V_3
V_2 → b V_4 | a    new V_2-rule
V_1 → V_2 V_3    needs to be modified, substitute V_2
V_1 → b V_4 V_3 | a V_3    new V_1-rule
All the rules are fine now. The grammar is in Greibach NF.
V_1 → b V_4 V_3 | a V_3
V_2 → b V_4 | a
V_3 → b
V_4 → b V_4 V_3 V_3 | a V_3 V_3
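
A quick sanity check of such a conversion is to compare the short strings generated by the original grammar and by the GNF grammar. The sketch below enumerates leftmost derivations by breadth-first search; it assumes there are no ε-productions (so sentential forms never shrink and can be pruned by length), and the dictionary encoding of the two grammars is an assumption of this illustration.

```python
from collections import deque


def language(grammar, start, max_len):
    """All terminal strings of length <= max_len derivable from start."""
    seen, words = {(start,)}, set()
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, if any.
        pos = next((i for i, s in enumerate(form) if s in grammar), None)
        if pos is None:
            words.add("".join(form))
            continue
        for body in grammar[form[pos]]:
            new = form[:pos] + tuple(body) + form[pos + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words


original = {"S": [["A", "B"]], "A": [["B", "S", "B"], ["a"]], "B": [["b"]]}
greibach = {
    "V1": [["b", "V4", "V3"], ["a", "V3"]],
    "V2": [["b", "V4"], ["a"]],
    "V3": [["b"]],
    "V4": [["b", "V4", "V3", "V3"], ["a", "V3", "V3"]],
}
print(sorted(language(original, "S", 8), key=len))
print(language(original, "S", 8) == language(greibach, "V1", 8))
```

Agreement on short strings is of course only evidence, not a proof of equivalence.
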
EXAMPLE 10.7.6: Find a GNF grammar equivalent to the following CFG.

S → AA | 0
A → SS | 1

Solution:

Step-1: Simplify G
There are no useless productions, no ε-productions, no unit productions. Nothing
to do here.

Step-2: Transform G into an equivalent CNF.
The CFG is already in CNF.

Step-3: Transform G into an equivalent grammar G’ in GNF. Rename the variables in the
productions P’ to A_1, A_2, ....

S = A_1; A = A_2
A_1 → A_2 A_2 | 0
A_2 → A_1 A_1 | 1

Step-4: Order the productions and remove recursion, wherever necessary.

A_1 → A_2 A_2 | 0    no modification needed.
A_2 → A_1 A_1 | 1    needs to be modified.

a. Modify the A_2-rule
Substitute the R.H.S.s of A_1 (for the leading A_1) in the A_2-rule:

A_2 → A_2 A_2 A_1 | 0 A_1 | 1
All the rules are in proper order:

A_1 → A_2 A_2 | 0
A_2 → A_2 A_2 A_1 | 0 A_1 | 1
b. Resolve left recursion

A_2 → A_2 A_2 A_1 | 0 A_1 | 1 needs to be resolved.

Introduce a new variable B and new productions B → α_1 | α_1 B | ....
Now,
A_2 → A_2 A_2 A_1 | 0 A_1 | 1 is nothing but
A_2 → A_2 A_2 A_1    (left-recursive, with α_1 = A_2 A_1)
A_2 → 0 A_1 | 1    (not left-recursive, with β_1 = 0 A_1 and β_2 = 1).
Since α_1 = A_2 A_1, we get B → A_2 A_1 and B → A_2 A_1 B
(∵ B → α_1 and B → α_1 B).
Replacing the left-recursive A_2-rule by the rules A_2 → β_i B gives:

A_2 → 0 A_1 B | 1 B.
Now, the rules without recursion are:
A_1 → A_2 A_2 | 0
A_2 → 0 A_1 B | 1 B
A_2 → 0 A_1 | 1
B → A_2 A_1
B → A_2 A_1 B,
or, in a compact form:
A_1 → A_2 A_2 | 0
A_2 → 0 A_1 B | 1 B | 0 A_1 | 1
B → A_2 A_1 | A_2 A_1 B
Step-5: Substitute to achieve GNF
A_1 → A_2 A_2 | 0    needs to be modified.
A_2 → 0 A_1 B | 1 B | 0 A_1 | 1
B → A_2 A_1 | A_2 A_1 B    needs to be modified.
a. Modify the A_1-rule: Substitute the R.H.S.s of A_2 in the A_1-rule (only for the starting variable in
the A_1-rule): A_1 → 0 A_1 B A_2 | 1 B A_2 | 0 A_1 A_2 | 1 A_2 | 0.
b. Modify the B-rule: Substitute the R.H.S.s of A_2 in the B-rule:
B → 0 A_1 B A_1 | 1 B A_1 | 0 A_1 A_1 | 1 A_1 | 0 A_1 B A_1 B | 1 B A_1 B | 0 A_1 A_1 B | 1 A_1 B.
All the rules are in proper order and the grammar is in GNF:
A_1 → 0 A_1 B A_2 | 1 B A_2 | 0 A_1 A_2 | 1 A_2 | 0
A_2 → 0 A_1 B | 1 B | 0 A_1 | 1
B → 0 A_1 B A_1 | 1 B A_1 | 0 A_1 A_1 | 1 A_1 | 0 A_1 B A_1 B | 1 B A_1 B | 0 A_1 A_1 B | 1 A_1 B
EXAMPLE 10.7.7: Transform the CFG into GNF, given G = ({A_1, A_2, A_3}, {a, b}, P, A_1)
and productions P as
A_1 → A_2 A_3
A_2 → A_3 A_1 | b
A_3 → A_1 A_2 | a.
Solution:
Step-1: Simplify G
There are no useless productions, no ε-productions, no unit productions. Nothing
to do here.
Step-2: Transform G into an equivalent CNF.

G is already in CNF. Nothing to do here.
Step-3: Transform G into an equivalent GNF.

Order the productions and remove recursion, wherever necessary.
A_1 → A_2 A_3    okay
A_2 → A_3 A_1 | b    okay
A_3 → A_1 A_2 | a    needs to be modified.
a. Modify the A_3-rule:
■ Substitute the R.H.S. of A_1 in the A_3-rule:
A_3 → A_2 A_3 A_2 | a    needs to be modified.
■ Substitute the R.H.S.s of A_2 in the A_3-rule:
A_3 → A_3 A_1 A_3 A_2 | b A_3 A_2 | a.
Now all the rules are ordered properly.
A_1 → A_2 A_3
A_2 → A_3 A_1 | b
A_3 → A_3 A_1 A_3 A_2 | b A_3 A_2 | a.
b. Resolve left recursion.

A_3 → A_3 A_1 A_3 A_2 | b A_3 A_2 | a needs to be resolved. Introduce a new variable B
and new productions B → α_1 | α_1 B | ....
Consider α = A_1 A_3 A_2:

B → α ⇒ B → A_1 A_3 A_2
B → αB ⇒ B → A_1 A_3 A_2 B
or B → A_1 A_3 A_2 | A_1 A_3 A_2 B.
The non-left-recursive alternatives are β_1 = b A_3 A_2 and β_2 = a. Replacing the
left-recursive rule by the rules A_3 → β_i B gives:
A_3 → b A_3 A_2 B | a B | b A_3 A_2 | a.
Now the rules without recursion are:
A_1 → A_2 A_3
A_2 → A_3 A_1 | b
A_3 → b A_3 A_2 B | a B | b A_3 A_2 | a
B → A_1 A_3 A_2 | A_1 A_3 A_2 B.
Step-4: Substitute backward to obtain GNF
A_1 → A_2 A_3    needs to be modified.
A_2 → A_3 A_1 | b    needs to be modified.
A_3 → b A_3 A_2 B | a B | b A_3 A_2 | a
B → A_1 A_3 A_2 | A_1 A_3 A_2 B    needs to be modified.


a. Modify the A_2-rule:
Substitute the R.H.S.s of A_3 in the A_2-rule (only for the starting variable of the A_2-rule).


A_2 → b A_3 A_2 B A_1 | a B A_1 | b A_3 A_2 A_1 | a A_1 | b.


b. Modify the A_1-rule:
Substitute the R.H.S.s of A_2 in the A_1-rule (only for the starting variable of the A_1-rule).


A_1 → b A_3 A_2 B A_1 A_3 | a B A_1 A_3 | b A_3 A_2 A_1 A_3 | a A_1 A_3 | b A_3.




Downloaded from https://www.cambridge.org/core. Stockholm University Library, on 06 Dec 2018 at 08:02:55, subject to the Cambridge Core terms of use, available at 
https://www.cambridge.org/core/terms. https://doi.org/10.1017/UPO97881 75968363.011 


Simplification of Context-Free Grammar 


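c. Modify the B-rule:
Substitute the R.H.S.s of A_1 in the B-rule (only for the starting variable of the B-rule).


B → b A_3 A_2 B A_1 A_3 A_3 A_2 | b A_3 A_2 B A_1 A_3 A_3 A_2 B |
a B A_1 A_3 A_3 A_2 | a B A_1 A_3 A_3 A_2 B |
b A_3 A_2 A_1 A_3 A_3 A_2 | b A_3 A_2 A_1 A_3 A_3 A_2 B |
a A_1 A_3 A_3 A_2 | a A_1 A_3 A_3 A_2 B |
b A_3 A_3 A_2 | b A_3 A_3 A_2 B.

All the rules are properly ordered. Hence, the grammar is in GNF.
A_1 → b A_3 A_2 B A_1 A_3 | a B A_1 A_3 | b A_3 A_2 A_1 A_3 | a A_1 A_3 | b A_3.
A_2 → b A_3 A_2 B A_1 | a B A_1 | b A_3 A_2 A_1 | a A_1 | b.

A_3 → b A_3 A_2 B | a B | b A_3 A_2 | a.

B → b A_3 A_2 B A_1 A_3 A_3 A_2 | b A_3 A_2 B A_1 A_3 A_3 A_2 B |
a B A_1 A_3 A_3 A_2 | a B A_1 A_3 A_3 A_2 B |
b A_3 A_2 A_1 A_3 A_3 A_2 | b A_3 A_2 A_1 A_3 A_3 A_2 B |
a A_1 A_3 A_3 A_2 | a A_1 A_3 A_3 A_2 B |
b A_3 A_3 A_2 | b A_3 A_3 A_2 B.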


10.8 Chomsky vs. Greibach Normal Form 


Chomsky Normal Form


■ CNF is named after Noam Chomsky.
■ CNF puts restrictions on the number of symbols on the right side of a production.
■ A CFG is in CNF only if all productions are of the form A → BC or A → a.
■ CNF is important because it yields efficient algorithms (for example, the CYK
membership algorithm).


Greibach Normal Form


■ GNF is named after Sheila Greibach.
■ GNF puts restrictions not on the length of the R.H.S. of a production, but on the positions
in which terminals and variables can appear.
■ A CFG is in GNF only if all the productions are of the form A → ax.
■ GNF is used
a. to prove that every CFL can be accepted by a non-deterministic PDA,
b. to construct a PDA equivalent to a CFG,
c. and because it is convenient to use in parsing.


Note-2: 

Kuroda Normal Form (KNF) 

A grammar is in KNF if and only if all the productions are of the
form

AB → CD or A → BC or A → B or A → a

where A, B, C and D are nonterminals and a is a terminal. (Because of the rule
AB → CD, a grammar in KNF is in general not context-free.)


10.9 Application of Context-Free Grammars 


10.9.1 Parsing 


Parsing is finding a sequence of productions by which a string W in L(G) is
derived. Thus, parsing a string W means finding a derivation for that string, i.e., a sequence of
applications of production rules which starts with S and ends in W.

Parsing a string is like recognising a string. An algorithm to recognise a string will 
give only a yes/no as answer. However, an algorithm to parse a string will give additional 
information about how the string can be formed from the grammar. Generally, the only 
realistic way to recognise a string of a context-free grammar is to parse it. 
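
The difference between recognising and parsing can be illustrated with a small CYK-style sketch for grammars in CNF. Instead of a bare yes/no, it returns one parse tree as nested tuples, or None if the string is not in the language. The function name, the argument layout and the use of the CNF grammar of Example 10.6.5 (strings with equally many a's and b's) are choices made for this illustration.

```python
def cyk_parse(word, unary, binary, start="S"):
    """unary: terminal -> set of variables A with A -> terminal.
    binary: list of triples (A, B, C) for productions A -> B C.
    Returns one parse tree for word, or None."""
    n = len(word)
    if n == 0:
        return None
    # table[i][l] maps a variable to a tree deriving word[i:i+l].
    table = [[{} for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        for var in unary.get(ch, ()):
            table[i][1][var] = (var, ch)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            cell = table[i][length]
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for a, b, c in binary:
                    if a not in cell and b in left and c in right:
                        cell[a] = (a, left[b], right[c])
    return table[0][n].get(start)


unary = {"a": {"A", "T_a"}, "b": {"B", "T_b"}}
binary = [
    ("S", "T_b", "A"), ("S", "T_a", "B"),
    ("A", "T_b", "K_1"), ("A", "T_a", "S"),
    ("B", "T_a", "K_2"), ("B", "T_b", "S"),
    ("K_1", "A", "A"), ("K_2", "B", "B"),
]
print(cyk_parse("ab", unary, binary))
print(cyk_parse("aab", unary, binary))
```

The returned tree records which productions were applied, which is exactly the additional information a parser provides over a recogniser.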


10.9.2 Design of programming languages 


CFG plays an important role in the design of programming languages. The use of balanced 
parentheses or the arithmetic expressions or the conditional expressions along with various 
operators, can be easily expressed using CFG. To describe the programming languages, the 
CFG uses BNF notations. 

As discussed earlier, Backus-Naur Form or BNF is a compact notation to represent
production rules. BNF is used for defining (the syntax of) the context-free parts of
programming languages. The advantage of using BNF is that it uses a small number
of symbols (distinct from those used in programming languages) to define programming
languages. The extended BNF notation mainly consists of:


a. Angular brackets: < >
These denote variables.
Example: <identifier> denotes the class of identifiers.
b. ::=
It is used to denote ‘is defined as’ (an alternative to → in a production rule).


Example: <identifier> ::= <letter> (<letter> | <digit>)*
This states that an identifier is defined by writing a letter, followed by zero or more
letters and digits.


c. ‘|’, ‘*’
They mean ‘or’ and ‘zero or more times’, respectively.


d. [ ]
It denotes zero or one.


Example: [<digit>] denotes zero or one occurrence of <digit>.
Thus, with extended BNF all non-terminals are written within < > and terminals are
written without < >. The ‘→’ in a production is replaced by ‘::=’.
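
The identifier rule given above corresponds directly to a regular expression. The following sketch assumes, for illustration only, that <letter> means an ASCII letter and <digit> a decimal digit.

```python
import re

# <identifier> ::= <letter> (<letter> | <digit>)*
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_identifier(text):
    """True if the whole text matches the identifier rule."""
    return IDENTIFIER.fullmatch(text) is not None


for candidate in ["x", "count1", "1count", ""]:
    print(repr(candidate), is_identifier(candidate))
```

The star in the extended BNF rule becomes the `*` of the regular expression, and the alternative `<letter> | <digit>` becomes a single character class.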




John Warner Backus (born on December 3, 1924) is an American computer
scientist, notable as leader of the team that invented the first high-level programming
language (FORTRAN), inventor of the Backus-Naur form (BNF, the almost universally
used notation to define formal language syntax) and also of the concept of
function-level programming. He received his B.S. and M.S. in Mathematics from Columbia
University.


He received the W.W McDowell Award From the IEEE in 1967, the National 
Medal of Science in 1975, the A.M Turing Award from the Association for Computing 
Machinery in 1977 and an honorary doctorate from York University, England in 1985. 


Peter Naur (born on October 25, 1928) is a Danish pioneer in Computer Science
and winner of Turing Award. His last name is the N in the BNF notation (Backus- 
Naur form), used in the description of the syntax for most programming languages. 
He contributed to the creation of the ALGOL 60 programming language. 


He received his M.A in astronomy from Copenhagen University in 1949 and Ph.D., 
in astronomy from the same University in 1957. In 1963, he was given the Hagemanns 
Gold Medal and three years later the Rosenhjaer Prize. 


His main areas of inquiry are design, structure and performance of computer 
programs and algorithms. He has done pioneering works in areas such as software 
engineering and software architecture also. 
EXAMPLE 10.9.1: Consider the grammar


id → letter.rest
rest → letter.rest | digit.rest
letter → a | b | ... | z
digit → 0 | 1 | ... | 9.
The BNF notation for this grammar is:


<id> ::= <letter> <rest>
<rest> ::= <letter> <rest> | <digit> <rest>
<letter> ::= a | b | ... | z
<digit> ::= 0 | 1 | ... | 9.
EXAMPLE 10.9.2: Consider the grammar:


E → E + K | K
K → K * Z | Z
Z → (E) | id.
The BNF notation for this grammar is:

<E> ::= <E> + <K> | <K>
<K> ::= <K> * <Z> | <Z>
<Z> ::= (<E>) | id.
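
A grammar of this shape can be turned into a small recursive-descent recogniser. The sketch below is an illustration under one stated change: since the rules for E and K are left-recursive, each is rewritten as a loop (E → K (+ K)*, K → Z (* Z)*), which accepts the same strings. The tokeniser and function names are assumptions of the sketch.

```python
import re


def accepts(text):
    """True if text is generated by E -> E + K | K, K -> K * Z | Z,
    Z -> (E) | id."""
    tokens = re.findall(r"id|\S", text)   # 'id' or any single symbol
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r} at token {pos}")
        pos += 1

    def parse_e():            # E -> K ('+' K)*
        parse_k()
        while peek() == "+":
            eat("+")
            parse_k()

    def parse_k():            # K -> Z ('*' Z)*
        parse_z()
        while peek() == "*":
            eat("*")
            parse_z()

    def parse_z():            # Z -> '(' E ')' | id
        if peek() == "(":
            eat("(")
            parse_e()
            eat(")")
        else:
            eat("id")

    try:
        parse_e()
        return pos == len(tokens)   # all input must be consumed
    except SyntaxError:
        return False


print(accepts("id + id * (id + id)"))
print(accepts("id + * id"))
```

One function per non-terminal mirrors the BNF rules, which is why BNF descriptions are a convenient starting point for hand-written parsers.
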

Some examples of how CFG is a useful tool for defining programming languages
are:
a. Definition of a C-type small language


Consider the basic elements of a C-type programming language.


1. Statement(st): Assignment statement(Ast), compound statement(cst), selection
statement(Sst), iteration statement(Ist).


2. Tokens: keywords, constants(const), strings(str), literals(Lt), operators(oper),
identifiers(id), special symbols.


3. Literals: Integer-value(int), boolean-value(boln), real-value(float).




4. Operators: Arithmetic operators(Aoper), Relational operators(Roper), Logical
operators(Loper)
5. Selection statements: if, switch
6. Iteration statements: while, for


The grammar to generate a C-type small language is given below:


<st> ::= <Ast> | <cst> | <Sst> | <Ist>
<Ast> ::= <id> = <expression> ;
<Sst> ::= if (<logical-exp>) <st> |
          if (<logical-exp>) <st> else <st> |
          switch (<expression>) {<cases>}
<logical-exp> ::= <comparison> | <comparison> && <logical-exp> |
                  <comparison> || <logical-exp>
<comparison> ::= (<Boolean-operand> <Roper> <Boolean-operand>)
<Boolean-operand> ::= true | false | <id>
<expression> ::= <factor> | <expression> + <factor> |
                 <expression> - <factor>
<factor> ::= <operand> | <factor> * <operand>
<operand> ::= <integer> | <id> | (<expression>)
<Roper> ::= > | >= | < | <= | = | !=
<Loper> ::= && | || | !
<cases> ::= <case> <cases> | <Default>
<Default> ::= default: <statement-seq>
<Ist> ::= while (<logical-exp>) <st> |
          do <st> while (<expression>) |
          for ([<expression>]; [<expression>]; [<expression>]) <st>
Describing tokens:
<token> ::= <id> | <keyword> | <Lt> | <separator> | <oper>
<id> ::= <letter> (<letter> | <digit>)*
<letter> ::= a | b | ... | y | z | A | B | ... | Y | Z
<digit> ::= 0 | 1 | ... | 9
<keyword> ::= if | else | while | int | float | main | char
<literal> ::= <int> | <boln> | <float>
<int> ::= (<digit>)+
<boln> ::= true | false
<float> ::= <decimal>
<decimal> ::= <signed-integer> . <integer>
<Aoper> ::= + | - | * | / | %.
b. Definition of the parts of HTML
<HTML-Document> ::= <html> <document> </html>
<document> ::= <head> <headpart> </head>
               <body {body-attributes}*> <body-part> </body>
<headpart> ::= [<title> <titlepart> </title>]
<titlepart> ::= <string>
<string> ::= [<letter>] [<letter> | <digit>]*
<digit> ::= 0 | 1 | ... | 9
<letter> ::= a | b | ... | y | z | A | B | ... | Y | Z
<body-attributes> ::= [background = “background-value”] [bgcolor = “color”]
<body-part> ::= <comment> | <image> | <textual> | <line-break> | <linking>
<comment> ::= <!-- <body-part> -->
<textual> ::= [<text>] <body-part>
<text> ::= <string>
<line-break> ::= <br> <body-part>
<image> ::= <img src = “path”>
<linking> ::= <a href = “path”> <linked-item> </a>
<linked-item> ::= <string> | <image>.


10.10 Exercises 


1. Define useful and useless symbols in CFG.
2. Identify useful and useless symbols for the grammars shown below:

a. S → aSa | bSb | A
A → aBb | bBa
B → aB | bB | ε

b. S → aA | bB
A → aA | a
B → bB
D → ab | Ea
E → aC | d

3. What is reduction of grammar? Write a general procedure to reduce a given grammar.

4. Obtain a reduced grammar for the grammar shown below:
S → aAa
A → Sb | bCC | aDA
C → ab | aD
E → acC
D → aAD.
5. What is an ε-production? Explain nullable variables with examples.
6. Identify the nullable variables from the grammar shown below:
S → BAAB
A → 0A2 | 2A0 | ε
B → AB | 1B | ε.
7. For the grammars shown below, eliminate the ε-productions.

a. S → ABAC
A → aA | ε
B → bB | ε
C → c.

b. S → aSa | bSb | A
A → aBb | bBa
B → aB | bB | ε.

c. S → aA | a | B | C
A → aB | ε
B → aA
C → cCD
D → abd.

8. What is a unit production? Give a general method of eliminating unit productions
from the grammar.

9. Eliminate the unit productions from the following grammars:

a. S → AB
A → a
B → C | b
C → D
D → E | bc
E → a | Ab.

b. S → A0 | B
B → A | 11
A → 0 | 12 | B.

c. S → Aa | B | Ca
B → aB | b
C → Db | D
D → E | d
E → ab.
10. What is Chomsky Normal Form? Give the general procedure to transform a grammar
to Chomsky Normal Form.

11. Simplify the following CFG and convert it into CNF.

a. S → AaB | aaB
A → ε
B → bbA | ε.

b. AE → AE + T | T
T → T * E | E
E → (AE) | I
I → a | b | c | Ia | Ib | Ic.
12. What is Greibach Normal Form? Give the general procedure to transform a grammar
to Greibach Normal Form.

13. Convert the following grammars into GNF.

a. S → AB1 | 0
A → 00A | B
B → 1A1.

b. A → BC
B → CA | b
C → AB | a.

14. Bring out the differences between CNF and GNF.
‘Pushdown lethargy in a context-free environment, then is available an unlimited amount
of resources—’


Introduction 


The applications of regular languages are very important in the field of computer science.
However, many useful languages are not regular and cannot be accepted by DFAs
and NFAs. For example, the language consisting of nested, balanced parentheses is not
regular and hence is not accepted by any DFA, where


L = {ε, (), ()(), (()), ...}.


In fact, this language is useful in programming languages for the purpose of nesting
expressions and program blocks. The reason for the non-acceptance of this language by
a DFA is that a DFA can store information only in its current state. Hence, for
sufficiently long inputs, the machine loses track of the pattern of the parentheses, which shows
that the memory of a DFA is limited.

A pushdown automaton PDA is similar to FSA, except that PDA has an auxiliary stack 
which provides an unlimited amount of memory. A language L is recognised by a pushdown 
automaton, iff L is context-free. 

Thus, pushdown automaton accepts a rich class of languages (which may be regular or 
non-regular) with an unlimited memory capacity, in the form of stack. PDA contributes 
mainly in the area of parsing and compiler construction. 
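
The role of the stack can be illustrated with the balanced-parentheses language mentioned above. The sketch below is not a formal PDA, only a stack-based recogniser written in Python to show why unbounded memory of the last-in-first-out kind is sufficient for this language.

```python
def balanced(text):
    """True if text consists only of properly nested parentheses."""
    stack = []
    for ch in text:
        if ch == "(":
            stack.append(ch)        # remember the open parenthesis
        elif ch == ")":
            if not stack:
                return False        # nothing left to match
            stack.pop()
        else:
            return False            # symbol outside the alphabet
    return not stack                # every '(' must have been matched


for sample in ["", "()", "(()())", "(()", "())("]:
    print(repr(sample), balanced(sample))
```

The stack can grow as deep as the nesting requires, which is exactly what a fixed, finite set of states cannot do.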


Definition: 
There exist the following equivalent definitions for the concept of PDA: 


a. PDA is a way to represent the language class called context-free languages. In other 
words, PDAs are abstract devices that recognise context-free languages. 

b. PDA is a generalisation of FSA and a PDA changes from state to state, reading input 
symbols. Unlike FSA, transitions also update the stack either by popping symbols or 
by pushing them. 
11.1 Components of PDA


Figure 11.1. Components of PDA


The PDA is usually described as consisting of four components:


■ Input tape
■ Read unit
■ Control unit
■ Stack (memory unit)
11.1.1 Input Tape


It is an infinitely long tape on which the input is written.


The tape is divided into a sequence of cells. It begins at the left end and extends
to the right, without an end.


Each cell of the tape holds one input letter or a blank, ε.
The input string is written on to the tape prior to the beginning of the operation of the PDA.


The tape has a read-only head, which always moves to the right, one cell at a time. It
reads a symbol and cannot go back.


11.1.2 Read Unit 


Read unit of PDA reads words from the cells of the input tape, beginning with the first letter 
in the leftmost cell, and then moves to the right. However, it cannot go back. 


11.1.3 Control Unit 


It governs the operations of PDA by performing a sequence of transitions between 
internal states available to it. 

The control unit executes a transition, whenever a letter of the input string is provided to 
it by the read unit. These transitions are determined by the transition function of PDA. 


Internal states available to the control unit will have START, ACCEPT and REJECT 
states. 


11.1.4 Stack 
A PDA has an infinitely tall PUSHDOWN STACK, which has a Last-In-First-Out
(LIFO) discipline.


The stack always starts empty.
The stack can hold letters of the STACK alphabet, which can be the same as the input alphabet.
Usually, an initial stack symbol is placed on the top of the stack.
The primitives that are used to operate on the stack are:
a. Push: adds a symbol to the top of the stack.


b. Pop: removes the top symbol from the stack. If the stack is
empty, then a basic pop does not change the state of the stack.


c. Nop: does nothing to the stack.
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11.2 Description of PDA 


There are different ways to describe the task of PDA, as follows: 


11.2.1 Transition Diagram 


In a directed graph, for an arc going from the vertex which corresponds to state p, to the 
vertex that corresponds to state q, the edge labelling can be represented in different forms 
as follows: 


Form-1: (input symbol, top symbol of stack) / operations on stack

[Figure 11.2: an arc from state p to state q, labelled "(input symbol, top symbol of stack)"
above the arc and "operations on stack" below it.]

Figure 11.2. Edge Label Format


Form-2: (p, input symbol, top symbol of stack, operation on stack, q)

[Figure 11.3: an arc from state p to state q, labelled
"(p, input symbol, top symbol of stack, operations, q)".]

Figure 11.3. Edge Label Format


Form-3: input symbol, pop old stack symbol / push new stack symbol

[Figure 11.4: an arc from state p to state q, labelled
"input symbol, pop old stack symbol / push new stack symbol".]

Figure 11.4. Edge Label Format


The transition diagram of a PDA incorporates the following operations: if the PDA is in state
p, the input symbol is any letter of a string w (the input symbol can also be ε) and 's' is the
top element of the stack, then the PDA
a. executes the stack operation (push | pop | nop),
b. moves to state q, and
c. if the input symbol is not ε, moves the head to the right (i.e. to the next cell of the tape).
EXAMPLE 11.2.1: (Edge labelling in transition diagram) 
a. [Figure 11.5: an arc from state p to state q, labelled "a, g" above and "push(x)" below,
where g is the initial stack symbol.]

Figure 11.5. Transition Diagram Showing Edge Label Format-1


The label (a, g / push(x)) means that, with the current state as p, the control goes to the
state q by reading the input symbol 'a'. Further, with the current stack top 'g', it pushes x
onto the stack.


b. [Figure 11.6: an arc from state p to state q, labelled (p, a, g, push(x), q).]

Figure 11.6. Transition Diagram Showing Edge Label Format-2


c. [Figure 11.7: an arc from state q1 to state q2, labelled "a, b / c".]

Figure 11.7. Transition Diagram Showing Edge Label Format-3


Here the PDA goes from q1 to q2 by reading an input symbol 'a', pops the top symbol
'b' from the stack and finally pushes the new symbol 'c' onto the stack.


11.2.2 Transition Table 


The operation of a PDA on a given input string can be represented in a
tabular format called a transition table. The table format is shown below.


Unread input | Transition | New state


Table 11.1 Transition table format


EXAMPLE 11.2.2: Consider a PDA, whose task is described in the transition diagram shown 
in figure 11.8: 


[Figure 11.8: state q0 with self-loops labelled "a, z0 / push(a)" and "a, a / push(a)", and
an arc from q0 to q1 labelled "b, a / pop".]

Figure 11.8. Transition Diagram of the PDA

The transition table for the above PDA is given in table 11.2.


Unread input | Transition | New state
aab | - | q0
ab | (q0, a, z0, push(a), q0) | q0
b | (q0, a, a, push(a), q0) | q0
ε | (q0, b, a, pop, q1) | q1

Table 11.2 Transition table for w = aab with z0 as initial symbol on stack


The transition function of PDA is discussed in the next section. 


11.3 Elements of PDA 


A PDA is characterised by the following seven components:


a. A finite set Q of states q0, q1, ..., qn.
b. A finite input alphabet of letters Σ = {a, b, ...}, for forming the input string.
c. A finite set Γ of stack symbols.
d. A transition function δ, which tells how the PDA goes from one step to the next.
e. An initial state q0.
f. A symbol z0, the initial stack symbol, which marks the top of the stack at the start.
g. A set of final states F.


11.3.1 Ordered Seven-Tuple Specification of PDA 


Formally, a PDA is a seven-tuple
M = (Q, Σ, Γ, δ, q0, z0, F)
where,
Q is the finite set of states of the PDA.
Σ is a finite set of input symbols.
Γ is the finite set of stack symbols.
q0 ∈ Q is the initial state.
F ⊆ Q is the set of final states.
z0 ∈ Γ is the initial stack symbol, placed on the top of the stack.
δ is the transition function of the PDA and is defined as


Q × (Σ ∪ {ε}) × Γ to Q × Γ*
(for a nondeterministic PDA, to finite subsets of Q × Γ*).


11.3.2 Transitions of PDA


The transition of PDA can be represented in different ways, as follows: 


Form-1: 
6(current state, current input symbol, current stack top) = (new state, new stack top). 
Form-2: 


(current state, current input symbol, current stack top, operation on stack, new state) 


EXAMPLE 11.3.1: (Transition format) 


Figure 11.9. Transition Diagram 


Transitions of the above diagram are represented using the different forms as:

Form-1: δ(q0, a, z0) = (q0, a)
This means that for the current state q0 and current input symbol 'a', if the current
stack top is z0, then the new state and new stack top are represented as (q0, a).

Form-2: (q0, a, z0, push(a), q0)
This means that for the current state q0 and current input symbol 'a', if the current
stack top is z0, then push 'a' onto the stack and remain in state q0.

Similarly, the other transitions of the above diagram are:

Form-1: δ(q0, a, a) = (q0, aa)

Form-2: (q0, a, a, push(a), q0)

Form-1: δ(q0, b, a) = (q1, ε)

Form-2: (q0, b, a, pop, q1)

where 'ε' is used to indicate that 'a' (the current stack top) is popped from the stack.


11.4 The Language Accepted by PDA 


There are two ways to describe the acceptance of a language: 


a. Define the language accepted to be the set of all inputs for which some sequence
of moves causes the PDA to empty its stack. This language is referred to as the language
accepted by empty stack.
b. Designate some states as final states and define the accepted language as the set of all
inputs for which some choice of moves causes the PDA to enter a final state.
In other words, a language is accepted by a PDA if the machine attains the form
'(final state, end of input, empty stack)'.
Formally, let M = (Q, Σ, Γ, δ, q0, z0, F) be a PDA. The language accepted by M is


L(M) = {w ∈ Σ* | (q0, w, z0) ⊢* (p, ε, κ), p ∈ F and κ ∈ Γ*}.


11.5 Instantaneous Description (ID) of PDA 


The ID of a PDA is defined as a triple (q, w, s), where q is the current state, w is the
remaining input and s is the current stack contents.


Formally, let M = (Q, Σ, Γ, δ, q0, z0, F) be a PDA. A move from one ID to the next is
written as:


(q, aw, zα) ⊢ (p, w, βα),


which is possible when δ(q, a, z) contains (p, β).


EXAMPLE 11.5.1: Consider the following transition diagram of a PDA. 


[Figure 11.10: state q0 with self-loops labelled "a, z0 / push(a)" and "a, a / push(a)", an
arc from q0 to q1 labelled "b, a / pop", a self-loop on q1 labelled "b, a / pop", and an arc
from q1 to the final state q2 labelled "ε, z0 / nop".]

Figure 11.10. Transition Diagram PDA
Consider the input string w = aaabbb and z0 as the current stack top.

Unread input | Transition | New state
aaabbb | - | q0
aabbb | (q0, a, z0, push(a), q0) | q0
abbb | (q0, a, a, push(a), q0) | q0
bbb | (q0, a, a, push(a), q0) | q0
bb | (q0, b, a, pop, q1) | q1
b | (q1, b, a, pop, q1) | q1
ε | (q1, b, a, pop, q1) | q1
- | (q1, ε, z0, nop, q2) | q2


Table 11.3. Transition table
ID is: (q0, aaabbb, z0) ⊢ (q0, aabbb, az0)
⊢ (q0, abbb, aaz0)
⊢ (q0, bbb, aaaz0)
⊢ (q1, bb, aaz0)
⊢ (q1, b, az0)
⊢ (q1, ε, z0)
⊢ (q2, ε, z0)

that is, (q0, aaabbb, z0) ⊢* (q2, ε, z0).


Since the final state q2 is reached, the stack is empty (only z0 remains) and the input is
consumed (the remaining input is ε), the string is accepted by the PDA.
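
The trace above can be reproduced mechanically. The following is a minimal Python sketch, not part of the original text: the names TRANSITIONS, FINAL_STATES and run are hypothetical, the empty string "" stands for ε, and the simulator assumes that at most one move applies in each configuration, which holds for the PDA of figure 11.10.

```python
# Transitions in five-tuple form: (state, input, stack_top, operation, new_state).
# "" stands for the empty input (epsilon); "z0" is the initial stack symbol.
TRANSITIONS = [
    ("q0", "a", "z0", ("push", "a"), "q0"),
    ("q0", "a", "a", ("push", "a"), "q0"),
    ("q0", "b", "a", ("pop",), "q1"),
    ("q1", "b", "a", ("pop",), "q1"),
    ("q1", "", "z0", ("nop",), "q2"),
]
FINAL_STATES = {"q2"}


def run(word, transitions=TRANSITIONS, start="q0", finals=FINAL_STATES):
    """Return (accepted, trace), where trace lists the IDs (state, input, stack)."""
    state, pos, stack = start, 0, ["z0"]  # the top of the stack is stack[-1]
    trace = [(state, word[pos:], "".join(reversed(stack)))]
    while stack:
        top = stack[-1]
        # Try a move on the next input letter first, then an epsilon-move.
        candidates = ([word[pos]] if pos < len(word) else []) + [""]
        move = None
        for read in candidates:
            move = next((t for t in transitions
                         if t[0] == state and t[1] == read and t[2] == top), None)
            if move is not None:
                break
        if move is None:
            break  # no transition is defined: the machine halts
        _, read, _, op, state = move
        if op[0] == "push":
            stack.append(op[1])
        elif op[0] == "pop":
            stack.pop()
        pos += len(read)
        trace.append((state, word[pos:], "".join(reversed(stack))))
    accepted = state in finals and pos == len(word)
    return accepted, trace


if __name__ == "__main__":
    ok, ids = run("aaabbb")
    for config in ids:
        print(config)
    print("accepted" if ok else "rejected")
```

Each printed triple corresponds to one ID of the derivation, with "" in place of ε for the exhausted input.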


11.6 Design of PDAs 


The basic design strategy for PDA is as follows: 


a. Understand the properties of the language for which the PDA has to be designed.
b. Determine the state and alphabet sets required.
c. Identify the initial, accepting and dead states of the PDA.
d. Decide on the stack symbols required.
e. Determine the initial stack symbol from the stack symbol set.
f. For each state, decide on the transition to be made for each character of the input string.
g. For each state transition, decide on the stack operation to be performed.
h. Obtain the transition diagram and table for the PDA.
i. Test the PDA obtained on short strings.




EXAMPLE 11.6.1: Design a PDA to accept the language


L = {wCw^R | w ∈ (0 + 1)*}
by an empty stack.
or


Design a PDA that accepts the language of odd palindromes.


Design Strategy: The PDA must operate in the following way. As it reads the first half of its
input, it remains in its initial state q0 and pushes all the symbols from the input string onto
the stack. When the machine sees a 'C' symbol in the input string, it switches from state q0
to state q1 without operating on the stack. In state q1, each input symbol causes the removal
of the top symbol on the stack, provided that it is the same as that input symbol. If the input
symbol does not match the top symbol on the stack, then no further operation is possible. If
the automaton attains the form

'(final state, end of input or ε, empty stack)'

then the input is of the form wCw^R and the PDA has accepted the input string. On the other
hand, if the automaton detects a mismatch between input and stack symbols, or if the input is
exhausted before the stack is emptied, then it does not accept the string.


[Figure 11.11: the input string is split into a first half w and a second half w^R around the
symbol C. While reading w, push all input symbols onto the stack and remain in q0. Once C is
encountered, change from q0 to q1. While reading w^R, pop the top symbol from the stack if it
is the same as the next input symbol.]


Figure 11.11. Design View of PDA


Following are the transitions required: 


i. Read each symbol of the input string and push it onto the stack before C is encountered.
a. (q0, ε, z0, nop, q0)
b. (q0, 0, z0, push(0), q0)
c. (q0, 1, z0, push(1), q0)
d. (q0, 0, 0, push(0), q0)
e. (q0, 1, 1, push(1), q0)
f. (q0, 1, 0, push(1), q0)
g. (q0, 0, 1, push(0), q0)
ii. Once 'C' is encountered, change to q1 without performing any operation.
h. (q0, C, 0, nop, q1)
i. (q0, C, 1, nop, q1)
j. (q0, C, z0, nop, q1)
iii. Read the next input symbol in the second half and pop from the stack if there is a
match.
k. (q1, 0, 0, pop, q1)
l. (q1, 1, 1, pop, q1)
iv. Once the stack is empty or the input is ε, change to q2.
m. (q1, ε, z0, nop, q2)


Thus, PDA = ({q0, q1, q2}, {0, 1, C}, {0, 1, z0}, δ, q0, z0, {q2}), where δ is given by:
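Transition diagram:


[Figure 11.12: state q0 with self-loops labelled "ε, z0 / nop", "0, z0 / push(0)",
"1, z0 / push(1)", "0, 0 / push(0)", "1, 1 / push(1)", "1, 0 / push(1)" and "0, 1 / push(0)";
an arc from q0 to q1 labelled "C, 0 / nop", "C, 1 / nop" and "C, z0 / nop"; self-loops on q1
labelled "0, 0 / pop" and "1, 1 / pop"; and an arc from q1 to the final state q2 labelled
"ε, z0 / nop".]


Figure 11.12. Transition Diagram, L = {wCw^R | w ∈ (0 + 1)*}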


PDA action for the input string: 
Consider the input string w = 001C100, whose description is given by: 


Transition Table:
Unread input | Transition | Stack | New State
001C100 | - | z0 | q0
01C100 | (q0, 0, z0, push(0), q0) | 0z0 | q0
1C100 | (q0, 0, 0, push(0), q0) | 00z0 | q0
C100 | (q0, 1, 0, push(1), q0) | 100z0 | q0
100 | (q0, C, 1, nop, q1) | 100z0 | q1
00 | (q1, 1, 1, pop, q1) | 00z0 | q1
0 | (q1, 0, 0, pop, q1) | 0z0 | q1
ε | (q1, 0, 0, pop, q1) | z0 | q1
- | (q1, ε, z0, nop, q2) | z0 | q2


Table 11.4 Transition table for L = {wCw^R | w ∈ (0 + 1)*}
a. To show that w = 001C100 is accepted by the PDA:

ID is: (q0, 001C100, z0) ⊢ (q0, 01C100, 0z0)
⊢ (q0, 1C100, 00z0)
⊢ (q0, C100, 100z0)
⊢ (q1, 100, 100z0)
⊢ (q1, 00, 00z0)
⊢ (q1, 0, 0z0)
⊢ (q1, ε, z0)
⊢ (q2, ε, z0).


Since the final state q2 is reached with the stack empty (only z0 remains), it follows that w
is accepted by the PDA.


b. To show that w = 01C00 is rejected by the PDA:
ID: (q0, 01C00, z0) ⊢ (q0, 1C00, 0z0)
⊢ (q0, C00, 10z0)
⊢ (q1, 00, 10z0)


No transition is defined for the above configuration (state q1, input 0, stack top 1), hence
the string w is rejected by the PDA.


EXAMPLE 11.6.2: Construct a PDA to accept L = {ww^R | w ∈ (0 + 1)*}
or
Build a PDA that accepts the language of even palindromes.
Design Strategy: The PDA must operate in the following way. As it reads the first half of
its input, it remains in its initial state q0 and pushes all the symbols from the input string
onto the stack. When the machine sees the first symbol of the string w^R, it has reached the
end of the first half of its input. Since nothing in the input marks this point, the machine
guesses it nondeterministically. Now, the machine changes from state q0 to state q1 without
operating on the stack. For the second half of the input, each input symbol has to be matched
with the top symbol on the stack. If there is no match, then no further operation is possible
and we say that the input is rejected by the machine. If there is a match, then the top symbol
on the stack is removed and the next input symbol is compared. This process continues until
the input is exhausted and only z0 remains on the stack; we then say that the input is of the
form ww^R and is accepted.


[Figure 11.13: the input string is split into a first half w and a second half w^R. While
reading w, push all input symbols onto the stack and remain in q0. Once the first symbol of
w^R (the middle of the input) is encountered, change from q0 to q1. While reading w^R, pop the
top symbol from the stack if it is the same as the next input symbol.]


Figure 11.13. Design View of PDA


Following are the transitions required: 


i. Read each symbol of the input string and push it onto the stack, before the first symbol
of the string w^R is encountered.
a. (q0, ε, z0, nop, q0)
b. (q0, 0, z0, push(0), q0)
c. (q0, 1, z0, push(1), q0)
d. (q0, 0, 0, push(0), q0)
e. (q0, 1, 1, push(1), q0)
f. (q0, 1, 0, push(1), q0)
g. (q0, 0, 1, push(0), q0)
ii. Once the first symbol of the string w^R is encountered, perform the following to change
from q0 to q1.
h. (q0, ε, 0, nop, q1)
i. (q0, ε, 1, nop, q1)
j. (q0, ε, z0, nop, q1)
iii. Read the next input symbol in the second half and pop from the stack if there is a
match.
k. (q1, 0, 0, pop, q1)
l. (q1, 1, 1, pop, q1)
iv. Once the stack is empty or the input is 'ε', change to q2.
m. (q1, ε, z0, nop, q2)


Thus, PDA = ({q0, q1, q2}, {0, 1}, {0, 1, z0}, δ, q0, z0, {q2}), where δ is given by:


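Transition diagram:


[Figure 11.14: state q0 with self-loops labelled "ε, z0 / nop", "0, z0 / push(0)",
"1, z0 / push(1)", "0, 0 / push(0)", "1, 1 / push(1)", "1, 0 / push(1)" and "0, 1 / push(0)";
an arc from q0 to q1 labelled "ε, 0 / nop", "ε, 1 / nop" and "ε, z0 / nop"; self-loops on q1
labelled "0, 0 / pop" and "1, 1 / pop"; and an arc from q1 to the final state q2 labelled
"ε, z0 / nop".]


Figure 11.14. Transition Diagram, L = {ww^R | w ∈ (0 + 1)*}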
PDA Action for the input string: 


Consider the input string w = 0110, whose description is given by:
Transition table:


Unread input | Transition | Stack | New State
0110 | - | z0 | q0
110 | (q0, 0, z0, push(0), q0) | 0z0 | q0
10 | (q0, 1, 0, push(1), q0) | 10z0 | q0
10 | (q0, ε, 1, nop, q1) | 10z0 | q1
0 | (q1, 1, 1, pop, q1) | 0z0 | q1
ε | (q1, 0, 0, pop, q1) | z0 | q1
- | (q1, ε, z0, nop, q2) | z0 | q2


Table 11.5 Transition table for L = {ww^R | w ∈ (0 + 1)*}
a. To show that w = 0110 is accepted by the PDA:


ID: (q0, 0110, z0) ⊢ (q0, 110, 0z0)
⊢ (q0, 10, 10z0)
⊢ (q1, 10, 10z0)
⊢ (q1, 0, 0z0)
⊢ (q1, ε, z0)
⊢ (q2, ε, z0).
Since the final state q2 is reached with the stack empty (only z0 remains), it follows that w
is accepted by the PDA.
b. To show that w = 0100 is rejected by the PDA:


ID: (q0, 0100, z0) ⊢ (q0, 100, 0z0)
⊢ (q0, 00, 10z0)
⊢ (q1, 00, 10z0)


No transition is defined for the above configuration (state q1, input 0, stack top 1). Every
other choice of the point at which to move from q0 to q1 also gets stuck before reaching
(q2, ε, z0), hence the string w is rejected by the PDA.
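
Because the machine has to guess the middle of the input, a simulation must follow every possible choice rather than a single run. The following Python sketch, which is an illustration and not part of the original text, explores all configurations breadth-first. The function name accepts_even_palindrome is hypothetical, the input is assumed to be a string over {0, 1}, and transition a (the nop loop on z0) is left out because it never produces a new configuration.

```python
from collections import deque


def accepts_even_palindrome(word):
    """Explore every branch of the nondeterministic PDA for ww^R over {0, 1}."""
    # A configuration (ID) is (state, position in the input, stack as a tuple),
    # with the top of the stack at the end of the tuple.
    start = ("q0", 0, ("z0",))
    queue, seen = deque([start]), {start}
    while queue:
        state, pos, stack = queue.popleft()
        top = stack[-1]
        successors = []
        if state == "q0":
            if pos < len(word):
                # Transitions b to g: push the next input symbol.
                successors.append(("q0", pos + 1, stack + (word[pos],)))
            # Transitions h to j: guess that the middle has been reached.
            successors.append(("q1", pos, stack))
        elif state == "q1":
            if pos < len(word) and word[pos] == top:
                # Transitions k and l: pop when input and stack top match.
                successors.append(("q1", pos + 1, stack[:-1]))
            if top == "z0":
                # Transition m: move to the final state.
                successors.append(("q2", pos, stack))
        elif state == "q2" and pos == len(word):
            return True  # final state reached with the whole input consumed
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


if __name__ == "__main__":
    for w in ["0110", "0100", "", "1001"]:
        print(repr(w), accepts_even_palindrome(w))
```

The search is finite because every branch either consumes an input symbol or changes state, so the set of reachable configurations is bounded by the length of the input.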


EXAMPLE 11.6.3: Obtain a PDA to accept the language L = {a^n b^n | n ≥ 1}.

Design Strategy: The PDA must operate in the following way. As it reads the input symbol
'a', it pushes the symbol onto the stack and remains in the state q0. This process is continued
as long as the next input symbol is also 'a'. Once an input symbol 'b' is encountered, the
machine switches from q0 to q1 and this causes the removal of the top symbol on the stack. In
the state q1, for every 'b' encountered in the input, a pop operation (with stack top 'a') is
performed. If the machine reaches the final state at the end of the input with an empty stack
(only z0 remaining), then the input is of the form a^n b^n.


Following are the transitions required: 


i. Read each symbol 'a' of the input string and push it onto the stack, until 'b' is
encountered.

a. (q0, a, z0, push(a), q0)
b. (q0, a, a, push(a), q0)

ii. Once 'b' is encountered, change to q1 by performing pop operations.
c. (q0, b, a, pop, q1)
d. (q1, b, a, pop, q1)

iii. Once the stack is empty or the input is ε, change to q2.
e. (q1, ε, z0, nop, q2)
Thus, PDA = ({q0, q1, q2}, {a, b}, {a, z0}, δ, q0, z0, {q2}), where δ is given by:
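Transition diagram:


[Figure 11.15: state q0 with self-loops labelled "a, z0 / push(a)" and "a, a / push(a)", an
arc from q0 to q1 labelled "b, a / pop", a self-loop on q1 labelled "b, a / pop", and an arc
from q1 to the final state q2 labelled "ε, z0 / nop".]

Figure 11.15. Transition Diagram L = {a^n b^n | n ≥ 1}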


PDA action for the input string: 
Consider the input string w = aaabbb, whose description is given by: 
Transition table:


Unread input | Transition | Stack | New State
aaabbb | - | z0 | q0
aabbb | (q0, a, z0, push(a), q0) | az0 | q0
abbb | (q0, a, a, push(a), q0) | aaz0 | q0
bbb | (q0, a, a, push(a), q0) | aaaz0 | q0
bb | (q0, b, a, pop, q1) | aaz0 | q1
b | (q1, b, a, pop, q1) | az0 | q1
ε | (q1, b, a, pop, q1) | z0 | q1
- | (q1, ε, z0, nop, q2) | z0 | q2


Table 11.6 Transition table for L = {a^n b^n | n ≥ 1}


To show that aaabbb is accepted by the PDA:


ID: (q0, aaabbb, z0) ⊢ (q0, aabbb, az0)
⊢ (q0, abbb, aaz0)
⊢ (q0, bbb, aaaz0)
⊢ (q1, bb, aaz0)
⊢ (q1, b, az0)
⊢ (q1, ε, z0)
⊢ (q2, ε, z0).
Since the final state q2 is reached with the stack empty (only z0 remains), it follows that w
is accepted by the PDA.


EXAMPLE 11.6.4: Design a PDA to accept the language L = {w ∈ {a, b}* | n_a = n_b}.


Design strategy: The PDA must operate in the following way. To start with, whatever the
first input symbol is, it is pushed onto the stack. Next, if the input symbol is the same as
the top of the stack, the symbol is pushed onto the stack; otherwise the top of the stack is
popped. This process continues, and if the automaton reaches the end of the input (ε) with an
empty stack (only z0 remaining), then the input satisfies n_a = n_b and is accepted by the PDA.


Following are the transitions required: 
i. When the stack top is z0, read the input symbol, whatever it is, and push
it onto the stack.
a. (q0, a, z0, push(a), q0)
b. (q0, b, z0, push(b), q0)
ii. Push the input symbol onto the stack if it is the same as the top of the stack.
c. (q0, a, a, push(a), q0)
d. (q0, b, b, push(b), q0)
iii. If the input symbol is not the same as the stack top, then pop.
e. (q0, a, b, pop, q0)
f. (q0, b, a, pop, q0)
iv. Once the stack is empty or the input is ε, change to q1.
g. (q0, ε, z0, nop, q1)
Thus, PDA = ({q0, q1}, {a, b}, {a, b, z0}, δ, q0, z0, {q1}), where δ is given by:
Transition diagram:


[Figure 11.16: state q0 with self-loops labelled "a, z0 / push(a)", "b, z0 / push(b)",
"a, a / push(a)", "b, b / push(b)", "a, b / pop" and "b, a / pop", and an arc from q0 to the
final state q1 labelled "ε, z0 / nop".]


Figure 11.16. Transition Diagram, L = {w ∈ {a, b}* | n_a = n_b}
PDA action for the input string:
Consider the input string w = abbbabaa, whose description is given by: 


Transition table:
Unread input | Transition | Stack | New state
abbbabaa | - | z0 | q0
bbbabaa | (q0, a, z0, push(a), q0) | az0 | q0
bbabaa | (q0, b, a, pop, q0) | z0 | q0
babaa | (q0, b, z0, push(b), q0) | bz0 | q0
abaa | (q0, b, b, push(b), q0) | bbz0 | q0
baa | (q0, a, b, pop, q0) | bz0 | q0
aa | (q0, b, b, push(b), q0) | bbz0 | q0
a | (q0, a, b, pop, q0) | bz0 | q0
ε | (q0, a, b, pop, q0) | z0 | q0
- | (q0, ε, z0, nop, q1) | z0 | q1


Table 11.7 Transition table for L = {w ∈ {a, b}* | n_a = n_b}


To show that w = abbbabaa is accepted by the PDA:

ID: (q0, abbbabaa, z0) ⊢ (q0, bbbabaa, az0)
⊢ (q0, bbabaa, z0)
⊢ (q0, babaa, bz0)
⊢ (q0, abaa, bbz0)
⊢ (q0, baa, bz0)
⊢ (q0, aa, bbz0)
⊢ (q0, a, bz0)
⊢ (q0, ε, z0)
⊢ (q1, ε, z0).


Since the final state q1 is reached with the stack empty (only z0 remains), it follows that w
is accepted by the PDA.


EXAMPLE 11.6.5: Design a PDA to accept the language of nested, balanced parentheses.


Design strategy:
The PDA must operate in the following way:


■ Push all the symbols of type '(' onto the stack.
■ For each input symbol of type ')' encountered, remove one '(' from the stack, until
the stack becomes empty.


Following are the transitions required:


i. Read each symbol of the input string of type '(' and push it onto the stack, until ')' is
encountered.
a. (q0, (, z0, push((), q0)
b. (q0, (, (, push((), q0)
ii. Once ')' is encountered, change to q1 by performing pop operations.
c. (q0, ), (, pop, q1)
d. (q1, ), (, pop, q1)
iii. If the input is ε, change to the final state q2.
e. (q1, ε, z0, nop, q2)
Thus, PDA = ({q0, q1, q2}, {(, )}, {(, z0}, δ, q0, z0, {q2}), where δ is given by:
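Transition diagram:


[Figure 11.17: state q0 with self-loops labelled "(, z0 / push(()" and "(, ( / push(()", an
arc from q0 to q1 labelled "), ( / pop", a self-loop on q1 labelled "), ( / pop", and an arc
from q1 to the final state q2 labelled "ε, z0 / nop".]

Figure 11.17. Transition Diagram to Accept Balanced Parentheses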


PDA action for the input string
To show that w = ((())) is accepted by the PDA:


ID: (q0, ((())), z0) ⊢ (q0, (())), (z0)
⊢ (q0, ())), ((z0)
⊢ (q0, ))), (((z0)
⊢ (q1, )), ((z0)
⊢ (q1, ), (z0)
⊢ (q1, ε, z0)
⊢ (q2, ε, z0)


Since the final state q2 is reached with the stack empty (only z0 remains), it follows that w
is accepted by the PDA.


EXAMPLE 11.6.6: Design a PDA to accept L = {w | w ∈ (a, b)*}, such that:


i. n_a(w) > n_b(w) and
ii. n_a(w) < n_b(w)


Design strategy: The PDA must operate in the following way. To start with, whatever the
first input symbol is, it is pushed onto the stack. Next, if the input symbol is the same as
the top of the stack, then the symbol is pushed onto the stack; otherwise the top of the stack
is popped. This process is continued.


i. To show that the PDA accepts strings with more a's than b's, i.e. n_a(w) > n_b(w).


If the automaton reaches the form 'end of input (ε)' and the top of the stack
contains at least one a, then change to state q1 and perform no operation.
Following are the transitions required:
a. When the stack top is z0, read the input symbol, whatever it is, and
push it onto the stack.
■ (q0, a, z0, push(a), q0)
■ (q0, b, z0, push(b), q0)
b. Push the input symbol onto the stack if it is the same as the top of the stack.
■ (q0, a, a, push(a), q0)
■ (q0, b, b, push(b), q0)
c. If the input symbol is not the same as the stack top, then pop.
■ (q0, a, b, pop, q0)
■ (q0, b, a, pop, q0)
d. If the automaton reaches the form 'end of input' and the stack top contains
at least one a, then change to state q1.
■ (q0, ε, a, nop, q1)
Thus, the PDA for L = {w | w ∈ (a, b)*, n_a(w) > n_b(w)} is


PDA = ({q0, q1}, {a, b}, {a, b, z0}, δ, q0, z0, {q1}).


ii. To show that the PDA accepts strings with more b's than a's, i.e. n_a(w) < n_b(w).
If the automaton reaches the form 'end of input (ε)' and the top of the stack
contains at least one b, then change to state q1 and perform no operation.


The transitions required are:
a. When the stack top is z0, read the input symbol, whatever it is, and
push it onto the stack.
■ (q0, a, z0, push(a), q0)
■ (q0, b, z0, push(b), q0)
b. Push the input symbol onto the stack if it is the same as the top of the stack.
■ (q0, a, a, push(a), q0)
■ (q0, b, b, push(b), q0)
c. If the input symbol is not the same as the stack top, then pop.
■ (q0, a, b, pop, q0)
■ (q0, b, a, pop, q0)
d. If the automaton reaches the form 'end of input' and the stack top contains at least
one b, then change to state q1.
■ (q0, ε, b, nop, q1)
Thus, the PDA for L = {w | w ∈ (a, b)*, n_a(w) < n_b(w)} is


PDA = ({q0, q1}, {a, b}, {a, b, z0}, δ, q0, z0, {q1}).
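
The two machines of this example and the machine of Example 11.6.4 share the same push and pop rules and differ only in the final ε-move, which inspects the stack top at the end of the input. The following Python sketch is an illustration under that reading and not part of the original text; the function name compare_counts and its return values are hypothetical.

```python
def compare_counts(word):
    """Return "equal", "more a" or "more b" for a word over {a, b}."""
    stack = ["z0"]  # z0 is the initial stack symbol; the top is stack[-1]
    for symbol in word:
        if symbol not in ("a", "b"):
            raise ValueError("input must be over the alphabet {a, b}")
        top = stack[-1]
        if top == "z0" or top == symbol:
            stack.append(symbol)  # push on z0 or on a matching stack top
        else:
            stack.pop()           # pop when input and stack top differ
    # The final epsilon-move depends only on the stack top at end of input:
    # z0 means n_a = n_b, 'a' means n_a > n_b, 'b' means n_a < n_b.
    return {"z0": "equal", "a": "more a", "b": "more b"}[stack[-1]]


if __name__ == "__main__":
    for w in ["abbbabaa", "aab", "abb"]:
        print(w, compare_counts(w))
```

The stack only ever holds copies of one letter above z0, so its height is the absolute difference between the number of a's and b's read so far.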


11.7 Determinism and Nondeterminism 


Based on the processing of the input string by the machine, a PDA can be:


■ Deterministic PDA
■ Nondeterministic PDA


11.7.1 Deterministic PDA 


A PDA is deterministic if each input string can be processed by the machine in only
one way, i.e., for the same input symbol and the same stack symbol, there must be at most one
choice.


Formally, a PDA P = (Q, Σ, Γ, δ, q0, z0, F) is deterministic if


i. δ(q, a, z) has at most one element, and


ii. if δ(q, ε, z) is not empty, then δ(q, a, z) is empty for every input symbol a.


If conditions (i) and (ii) are satisfied, then the PDA is deterministic, otherwise the PDA is
nondeterministic.
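
The two conditions can be checked mechanically on a list of transitions. The following Python sketch is an illustration and not part of the original text: the function name is_deterministic and the lists ANBN and EQUAL_COUNTS are hypothetical, transitions are written as five-tuples (state, input, stack top, operation, new state), and "" stands for ε.

```python
def is_deterministic(transitions):
    """Check conditions (i) and (ii) on five-tuple transitions; "" is epsilon."""
    moves = {}
    for state, read, top, op, new_state in transitions:
        moves.setdefault((state, read, top), set()).add((op, new_state))
    # Condition (i): at most one move per (state, input symbol, stack top).
    if any(len(choices) > 1 for choices in moves.values()):
        return False
    # Condition (ii): an epsilon-move on (state, stack top) excludes any
    # move that reads an input letter from the same state and stack top.
    for (state, read, top) in moves:
        if read == "" and any(s == state and t == top and r != ""
                              for (s, r, t) in moves):
            return False
    return True


# PDA for a^n b^n, n >= 1.
ANBN = [
    ("q0", "a", "z0", "push(a)", "q0"),
    ("q0", "a", "a", "push(a)", "q0"),
    ("q0", "b", "a", "pop", "q1"),
    ("q1", "b", "a", "pop", "q1"),
    ("q1", "", "z0", "nop", "q2"),
]

# PDA for equal numbers of a's and b's.
EQUAL_COUNTS = [
    ("q0", "a", "z0", "push(a)", "q0"),
    ("q0", "b", "z0", "push(b)", "q0"),
    ("q0", "a", "a", "push(a)", "q0"),
    ("q0", "b", "b", "push(b)", "q0"),
    ("q0", "a", "b", "pop", "q0"),
    ("q0", "b", "a", "pop", "q0"),
    ("q0", "", "z0", "nop", "q1"),
]

if __name__ == "__main__":
    print(is_deterministic(ANBN))
    print(is_deterministic(EQUAL_COUNTS))
```

The second list fails condition (ii) because state q0 with stack top z0 has both an ε-move and moves that read a letter.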


EXAMPLE 11.7.1: A PDA for simple nested parentheses strings is deterministic:
L = {ε, (), ()(), (()), ((())), ()()(), ...}.


[Figure 11.18: transition diagram whose edges are labelled "(, z / push(()", "), ( / pop"
and "), ( / pop".]


Figure 11.18. Transition Diagram to Demonstrate Deterministic PDA


In the above diagram, it is clear that the machine processes each input string in only one way.
For example, δ(q0, (, z0) = (q0, () or (q0, (, z0, push((), q0).


EXAMPLE 11.7.2: The PDA for L = {a^n b^n | n ≥ 1} is deterministic.


11.7.2 Nondeterministic PDA 
A PDA is nondeterministic if there is some string that can be processed by it in more than
one way.

Two kinds of nondeterminism (two types of moves) may occur in a PDA:


a. The first kind of nondeterminism occurs when a state emits two or more edges labelled
with the same input symbol and the same stack symbol.


Formally, for the same input symbol and the same stack symbol there exists a number of
choices. This is represented as:


δ(q, a, z) = {(p1, α1), (p2, α2), ..., (pn, αn)} where


pi and q are states, a ∈ Σ, z is a stack symbol and αi ∈ Γ*.


EXAMPLE 11.7.3: First kind of non-determinism 


[Figure 11.19: state q1 with two outgoing edges, one to q2 and one to q3, both labelled with
input 'a' and stack top 'b'.]


Figure 11.19. Transition Diagram to Demonstrate Nondeterminism for a State that Emits
Two Edges Labelled with Same Input and Same Stack Symbol

In the above figure, it is seen that for the same input 'a' and for the same stack top 'b', there
are two choices:


■ δ(q1, a, b) = (q2, x) or (q1, a, b, push(x), q2)

■ δ(q1, a, b) = (q3, y) or (q1, a, b, push(y), q3)

b. The second kind of nondeterminism occurs when a state emits two edges labelled with the
same stack symbol, where one input symbol is 'ε' and the other input symbol is not.


Formally, δ(q, ε, z) = {(p1, α1), (p2, α2), ..., (pn, αn)}


and δ(q, a, z) = {(p1, α1), (p2, α2), ..., (pn, αn)}.


EXAMPLE 11.7.4: Second kind of non-determinism 


[Figure 11.20: state q0 with an edge to q1 labelled "a, z0 / push(x)" and an edge to q2
labelled "ε, z0 / nop".]

Figure 11.20. Transition Diagram to Demonstrate Non-determinism for a State that Emits
Two Edges, One Labelled with 'a' and the Other with ε, with the Same
Stack Symbol


In the above figure, for the same stack symbol, there are two choices based on the different
inputs:


■ (q0, a, z0, push(x), q1)
■ (q0, ε, z0, nop, q2)
EXAMPLE 11.7.5: Examples of nondeterministic PDA


a. Language L = {ww^R | w ∈ (a, b)*}
A language of PALINDROMES of even length, i.e. words that read the same in the forward
as well as the backward direction:


PALINDROME = {ε, aa, bb, aaaa, abba, baab, bbbb, ...}
b. Language L = {w ∈ {a, b}* | n_a = n_b}.
Note-1:


■ For finite automata, nondeterminism does not increase the power of machines.
■ For PDAs, nondeterminism does increase the power of machines.


EXAMPLE 11.7.6: Show that the PDA that accepts the language L = {w ∈ {a, b}* | n_a = n_b}
is nondeterministic.


Solution: The transitions required to design a PDA for L = {w ∈ {a, b}* | n_a = n_b} are:


i. (q0, a, z0, push(a), q0)
ii. (q0, b, z0, push(b), q0)
iii. (q0, a, a, push(a), q0)
iv. (q0, b, b, push(b), q0)
v. (q0, a, b, pop, q0)
vi. (q0, b, a, pop, q0)
vii. (q0, ε, z0, nop, q1)


To be nondeterministic, the PDA must satisfy any one of the following two conditions:


Condition-1: For the same input symbol and the same stack symbol, there exists a number
of choices,


i.e. δ(q, a, z) = {(p1, α1), ..., (pn, αn)}.
Since there is only one transition for each combination of input symbol and
stack symbol, condition-1 is not satisfied.
Condition-2: For the same stack symbol, where one input symbol is 'ε' and the other input
symbol is not, both of the following are defined:


δ(q, ε, z) = {(p1, α1), ..., (pn, αn)}


and δ(q, a, z) = {(p1, α1), ..., (pn, αn)}.
The given PDA has the following transitions for the same stack symbol z0,
one of which has 'ε' as its input symbol:
(q0, ε, z0, nop, q1)

(q0, a, z0, push(a), q0)

(q0, b, z0, push(b), q0)
Hence condition-2 is satisfied and the given PDA for

L = {w ∈ {a, b}* | n_a = n_b}


is nondeterministic.


EXAMPLE 11.7.7: Show that the PDA that accepts the language L = {ww^R | w ∈ (0 + 1)*}
is nondeterministic.


Solution: The transitions required to design a PDA for L = {ww^R | w ∈ (0 + 1)*} are:


i. (q0, ε, z0, nop, q0)
ii. (q0, 0, z0, push(0), q0)
iii. (q0, 1, z0, push(1), q0)
iv. (q0, 0, 0, push(0), q0)
v. (q0, 1, 1, push(1), q0)
vi. (q0, 1, 0, push(1), q0)
vii. (q0, 0, 1, push(0), q0)
viii. (q0, ε, 0, nop, q1)
ix. (q0, ε, 1, nop, q1)
x. (q0, ε, z0, nop, q1)
xi. (q1, 0, 0, pop, q1)
xii. (q1, 1, 1, pop, q1)
xiii. (q1, ε, z0, nop, q2)


To be nondeterministic, the PDA must satisfy any one of the following two conditions:


Condition 1: For the same input symbol and the same stack symbol, there exists a number
of choices


i.e., δ(q, a, z) = {(p1, α1), ..., (pn, αn)}.


For the same input symbol and the same stack symbol, the given PDA has
the following:

The transitions (viii) and (ix) are used by the PDA to change over from state
q0 to q1, once the first symbol of the string w^R is encountered (which could
be either 0 or 1). Thus, the transitions can also be written as


(q0, 0, 0, nop, q1) ... (viii*)
(q0, 1, 1, nop, q1) ... (ix*).
Now from transitions (iv) and (viii*), it is clear that
δ(q0, 0, 0) = {(q0, 00), (q1, nop)}.
From transitions (v) and (ix*), it is seen that
δ(q0, 1, 1) = {(q0, 11), (q1, nop)}.
Thus, for the same input symbol and the same stack symbol, there are a number of choices.
Equivalently, condition-2 holds directly: transition (viii) is an ε-move with stack top 0,
while transition (iv) reads the input symbol 0 with the same stack top.
Hence the given PDA is nondeterministic.


EXAMPLE 11.7.8: Show that the PDA that accepts the language L = {a^n b^n | n ≥ 1} is
deterministic.


Solution: The transitions required to design a PDA for L = {a^n b^n | n ≥ 1} are:
i. (q0, a, z0, push(a), q0)
ii. (q0, a, a, push(a), q0)
iii. (q0, b, a, pop, q1)
iv. (q1, b, a, pop, q1)
v. (q1, ε, z0, nop, q2)
To be deterministic, the PDA must satisfy the following two conditions:


Condition-1: For the same input symbol and the same stack symbol, there must be only
one choice,


i.e. δ(q, a, z) = (p, α).


Since there is only one transition for each combination of input symbol and
stack symbol in the given PDA, condition-1 is satisfied.


Condition-2: For the same stack symbol, where one input symbol is ε and the other input
symbol is not, it is observed that


δ(q, ε, z) = (p, α) is defined,
and δ(q, a, z) = (p, α) is not defined.


For the given PDA, (q1, ε, z0, nop, q2) is defined and (q1, a, z0, nop, q2) is
not defined.


Hence, both condition-1 and condition-2 are satisfied, from which it follows that the given
PDA for L = {a^n b^n | n ≥ 1} is deterministic.


EXAMPLE 11.7.9: Show that the PDA that accepts the language consisting of balanced
parentheses is deterministic.
Solution: The transitions required to design a PDA for balanced parentheses are


i. (q0, (, z0, push((), q0)
ii. (q0, (, (, push((), q0)
iii. (q0, ), (, pop, q1)
iv. (q1, ), (, pop, q1)
v. (q1, ε, z0, nop, q2)
To be deterministic, the PDA must satisfy the following two conditions:


Condition-1: For the same input symbol and the same stack symbol, there must be only
one choice,


i.e., δ(q, a, z) = (p, α).
Since there is only one transition for each combination of input symbol and
stack symbol, condition-1 is satisfied for the given PDA.


Condition-2: For the same stack symbol, where one input symbol is ε and the other input
symbol is not,


δ(q, ε, z) = (p, α) is defined
and δ(q, a, z) = (p, α) is not defined.


For the given PDA, (q1, ε, z0, nop, q2) is defined and (q1, a, z0, nop, q2) is
not defined for any input symbol a.


Hence both condition-1 and condition-2 are satisfied and the given PDA is
deterministic.


11.7.3. Demonstration of Deterministic PDA 


[Figure 11.21.a-h: Demonstration of Deterministic PDA for Simple, Nested Parentheses. Each panel shows the input, the stack and the transition diagram of a PDA for simple nested parenthesis strings.]


11.7.4 Demonstration of Nondeterministic PDA


[Figure 11.22.P-X: Demonstration of Non-Deterministic PDA for L(M) = {w : n_a(w) = n_b(w)}. Each panel shows the input, the stack and the transition diagram of the NPDA M, whose edges are labelled a, z0/0z0; b, z0/1z0; a, 0/00; b, 1/11; a, 1/ε; b, 0/ε and ε, z0/z0.]
11.8 Equivalence of PDA and CFL


A context-free grammar G and a pushdown automaton P are said to be equivalent iff
L(G) = L(P).
In other words, equivalence of PDA and CFL means:


a. the class of languages generated by context-free grammars is exactly the same as the class
of languages accepted by PDAs, i.e., it is possible to convert any context-free grammar
to a PDA, such that, L(G) = L(P)

b. the language accepted by a PDA is exactly the same as the language generated by some
context-free grammar, i.e., it is possible to convert any PDA to a context-free grammar, such
that, L(P) = L(G).


11.8.1 Conversion from Context-Free Grammar to PDA 


Procedure: Let G = (V, T, P, S) be the context-free grammar. Consider the PDA P =
(Q, Σ, Γ, δ, q0, z0, F), where q0 is the start state and z0 is the initial stack symbol. Conversion
from G to P needs the following steps:


Step-1: Without consuming any input, change the state to q1 and place the start symbol
of G onto the stack. The transition defined is:
δ(q0, ε, z0) = (q1, Sz0).


Step-2: If there is a production of the form A → aα, then the corresponding transition
is:


δ(q1, a, A) = (q1, α).


Step-3: In state q1, on encountering the end of the input, if z0 is present as the stack top,
then change the state to q2, which is the final state, and do not alter the contents
of the stack. The transition defined is:


δ(q1, ε, z0) = (q2, z0).


Note: Apply the above steps only if the grammar is in GNF; otherwise the grammar
G has to be first converted into GNF.
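
The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every variable and terminal is a single character, right-hand sides are plain strings in GNF, and the state names q0, q1, q2 and the stack symbol z0 follow the text. The function name `gnf_to_pda` and the dictionary layout are hypothetical.

```python
EPS = ""  # stands for ε


def gnf_to_pda(productions, start="S"):
    """Build the transition table for a grammar in Greibach normal form.

    `productions` maps a variable to a list of right-hand sides, each
    written as a string: one terminal followed by zero or more variables.
    Returns a dict mapping (state, input, stack_top) to a set of
    (next_state, pushed_string) pairs.
    """
    delta = {}

    def add(key, move):
        delta.setdefault(key, set()).add(move)

    # Step-1: push the start symbol above z0 without reading input.
    add(("q0", EPS, "z0"), ("q1", start + "z0"))

    # Step-2: A -> a alpha gives delta(q1, a, A) = (q1, alpha).
    for variable, bodies in productions.items():
        for body in bodies:
            terminal, alpha = body[0], body[1:]
            add(("q1", terminal, variable), ("q1", alpha))

    # Step-3: with z0 on top, move to the final state q2.
    add(("q1", EPS, "z0"), ("q2", "z0"))
    return delta
```

For instance, `gnf_to_pda({"S": ["aAA"], "A": ["aS", "bS", "a"]})` produces one entry per production plus the two ε-moves, which can be compared by hand with the transitions listed in the next example.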
EXAMPLE 11.8.1: Construct a PDA for the grammar


S → aAA
A → aS | bS | a.


Solution: The grammar


S → aAA
A → aS
A → bS
A → a


is in GNF.
The equivalent PDA is P = (Q, Σ, Γ, δ, q0, z0, F), where


Q = {q0, q1, q2}, Σ = {a, b}, Γ = {S, A, z0}.


δ is defined as follows:
■ Push the starting symbol 'S' onto the stack


δ(q0, ε, z0) = (q1, Sz0).


Grammar    | PDA
S → aAA    | δ(q1, a, S) = (q1, AA)
A → aS     | δ(q1, a, A) = (q1, S)
A → bS     | δ(q1, b, A) = (q1, S)
A → a      | δ(q1, a, A) = (q1, ε)

■ Change q1 to the final state q2


δ(q1, ε, z0) = (q2, z0).
EXAMPLE 11.8.2: Construct a PDA for the grammar


S → aA
A → aABD | bB | a
B → b
D → d.

Solution: The grammar

S → aA
A → aABD
A → bB
A → a
B → b
D → d

is in GNF.

The equivalent PDA is P = (Q, Σ, Γ, δ, q0, z0, F), where


Q = {q0, q1, q2}, Σ = {a, b, d}, Γ = {S, A, B, D, z0}.
δ is defined as follows:
■ Push the starting symbol 'S' onto the stack


δ(q0, ε, z0) = (q1, Sz0).


Grammar    | PDA
S → aA     | δ(q1, a, S) = (q1, A)
A → aABD   | δ(q1, a, A) = (q1, ABD)
A → bB     | δ(q1, b, A) = (q1, B)
A → a      | δ(q1, a, A) = (q1, ε)
B → b      | δ(q1, b, B) = (q1, ε)
D → d      | δ(q1, d, D) = (q1, ε)


■ Change q1 to the final state q2:
δ(q1, ε, z0) = (q2, z0).
EXAMPLE 11.8.3: Construct the PDA for the grammar
S → aSbb | a.
Solution: The given grammar is not in GNF. The GNF of this grammar is
S → aSA | a
A → bB
B → b.
The equivalent PDA is P = (Q, Σ, Γ, δ, q0, z0, F), where
Q = {q0, q1, q2}, Σ = {a, b}, Γ = {S, A, B, z0}.
δ is defined as
■ Push the starting symbol 'S' onto the stack


δ(q0, ε, z0) = (q1, Sz0).


Grammar    | PDA
S → aSA    | δ(q1, a, S) = (q1, SA)
S → a      | δ(q1, a, S) = (q1, ε)
A → bB     | δ(q1, b, A) = (q1, B)
B → b      | δ(q1, b, B) = (q1, ε)


■ Change q1 to the final state q2, so that


δ(q1, ε, z0) = (q2, z0).
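
To see the constructed PDA at work, the transitions of Example 11.8.3 can be simulated. The sketch below is an assumption-laden illustration, not part of the construction itself: the stack is kept as a tuple with its top first, ε is written as the empty string, and the search explores all nondeterministic choices. The names `DELTA`, `FINAL` and `accepts` are hypothetical.

```python
# Transitions of Example 11.8.3: (state, input, stack_top) -> moves.
# The stack is a tuple whose first element is the top; "" stands for ε.
DELTA = {
    ("q0", "", "z0"): [("q1", ("S", "z0"))],
    ("q1", "a", "S"): [("q1", ("S", "A")), ("q1", ())],
    ("q1", "b", "A"): [("q1", ("B",))],
    ("q1", "b", "B"): [("q1", ())],
    ("q1", "", "z0"): [("q2", ("z0",))],
}
FINAL = {"q2"}


def accepts(word):
    """Depth-first search over instantaneous descriptions (state, rest, stack)."""
    pending = [("q0", word, ("z0",))]
    seen = set()
    while pending:
        config = pending.pop()
        if config in seen:
            continue
        seen.add(config)
        state, rest, stack = config
        if state in FINAL and not rest:
            return True
        if not stack:
            continue
        top, below = stack[0], stack[1:]
        # ε-moves leave the input untouched.
        for target, pushed in DELTA.get((state, "", top), []):
            pending.append((target, rest, pushed + below))
        # Input-consuming moves read the next symbol.
        if rest:
            for target, pushed in DELTA.get((state, rest[0], top), []):
                pending.append((target, rest[1:], pushed + below))
    return False
```

The grammar S → aSbb | a generates a, aabb, aaabbbb and so on, so `accepts` should agree with membership in that set; strings such as ab or aab are rejected.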


11.9 Exercises 


1. What is a PDA? Explain with an example.
2. Explain the components of PDA, using a neat diagram.
3. Explain why finite automata are less powerful than PDAs.
4. Explain how PDAs are represented using transition diagrams.
5. Present the formal definition of PDA, with examples.
6. Formally define the concepts of string and language acceptance for PDAs.
7. Design PDAs to accept the following languages over Σ = {0, 1}:
a. 0* 
b. {0'1'0/V|i,j > 0} 
c. {071i > 1} 
d. {0^m 1^n | m ≠ n}
8. Obtain the ID of a PDA to accept the language of balanced parentheses, for the following 
inputs: 
a. € 
» (© 
- ()) 
» (QO)) 


9. Design a PDA to accept the following languages:
a. {xx | x ∈ {0, 1}*}
b. {x | x ∈ {0, 1}* and x = x^R}
c. {0^n 1^m | 3n < m < 7n}
10. Define a deterministic and nondeterministic PDA with examples. 
11. Show that the PDA L = {a"b*"|n > 1} is deterministic. 


12. Consider a PDA (M) with n states and m input alphabet symbols. What is the maximum
possible number of rejecting computations that M could have on an input of length k?


13. Prove that, for a context-free grammar G, there is an equivalent pushdown automaton P
such that L(G) = L(P).


14. For the grammar
S → aABC
A → aB | a
B → bA | b
C → a,
obtain the corresponding PDA.
15. Obtain the PDA for the CFG given below:
S → aABB | aAA
A → aBB | a
B → bBB | A
C → a.
Pumping Lemma


‘Pumping the energy into the system regularises the activities,
accepts the challenges, extends the benefits.’


Introduction 


In this chapter, the concept of Pumping Lemma and its applications are discussed in detail. 
This concept is an important one, which introduces certain novel features into the subject. 
The study and analysis is based on the following discussion: 

Suppose an automaton M, over an alphabet Σ, has k states and suppose w =
a1 a2 a3 ... an is a word over Σ accepted by M, such that, |w| = n > k. Let
P = (s0, s1, ..., sn) be the corresponding sequence of states determined by the word w.
The condition that n > k implies that two of the states in P must be equal, say si = sj
where i < j. Further, let


x = a1 a2 ... ai,  y = a_{i+1} ... aj,  z = a_{j+1} ... an.


Now, clearly from the figure 12.1, xy ends in si = sj; hence x y^m also ends in si. In other
words, for every m, w_m = x y^m z ends in sn, which is clearly an accepting state.


[Figure 12.1. The path of w through M, with the loop labelled y at the state si = sj.]


The above discussion leads to the following important result, the details of which are 
discussed in the subsequent sections. 


Suppose M is an automaton over Σ, such that:


a. M has k states
b. M accepts a word w from Σ* where |w| > k.


Then w = xyz where, for every positive m, w_m = x y^m z is accepted by M.
12.1 Introduction to Non-Regular Languages


Consider a finite automaton having 5 states, as shown below:


Figure 12.2. A Finite Automaton with 5 States


Let us process the string ababbaa on the FA:


q0 -a-> q1 -b-> q3 -a-> q2 -b-> q4 -b-> q3 -a-> q2 -a-> q1.
Since q1 is the final state, the string ababbaa is accepted. In general,


a. we always start from the initial state. 
b. after reading the first letter of the input string 
i. we may go to another state or return to the initial state, 
ii. the maximum number of different states, that can be visited after reading the first 
letter, is 2. 
c. after reading the first two letters of the input string, the maximum number of different 
states that can be visited, is 3. 
d. after reading the first m letters of the input string, the maximum number of different 
states visited would be (m + 1). 


Thus, with the string ababbaa, after reading the 5 letters, the maximum number of different 
states that is visited is 5 + 1 = 6. However, since FA has only 5 states, this means after 
reading the 5 letters, there is a state that is visited twice. 

Consider the string aaabaa:


a. The string length is 6, which is more than the number of states in the above FA.
b. Let us process the string aaabaa on FA:


q0 -a-> q1 -a-> q0 -a-> q1 -b-> q3 -a-> q2 -a-> q1.


Since q1 is the final state, the string is accepted.
c. Here, the state q0 is visited twice.
Thus, in general, for an FA with N states and a string w of length |w| ≥ N, there exists
at least one state that is visited at least twice.


(i) Let u be the state that is visited twice.
(ii) Break up the string w as w = xyz where x, y and z are 3 strings such that
■ string x has those letters that are at the beginning of w and are read by the FA until
the state u is hit for the first time.
■ string y has those letters used by the FA, starting from the time we are in the state u,
to the time the state u is hit for the second time.
■ string z contains the rest of the letters in w.


For the string w = ababbaa processed on the above FA,
u = q3, x = ab, y = abb and z = aa.
For the string w = aaabaa processed on the above FA,
u = q0, x = ε, y = aa, z = abaa.
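
The break-up of w into x, y and z can be computed by running the automaton and stopping at the first state that repeats. The sketch below assumes a partial transition table containing only the moves that can be read off the two runs traced above, since Figure 12.2 itself is not reproduced; under that reading the state reached after ab is q3. The names `DELTA` and `split_at_repeat` are illustrative.

```python
# Partial transition table: only the moves used by the two runs in the
# text (an assumption, since Figure 12.2 itself is not reproduced here).
DELTA = {
    ("q0", "a"): "q1", ("q1", "a"): "q0", ("q1", "b"): "q3",
    ("q3", "a"): "q2", ("q2", "a"): "q1", ("q2", "b"): "q4",
    ("q4", "b"): "q3",
}


def split_at_repeat(word, start="q0"):
    """Return (u, x, y, z), where u is the first state visited twice."""
    first_seen = {start: 0}  # state -> number of letters read on first visit
    state = start
    for position, letter in enumerate(word, start=1):
        state = DELTA[(state, letter)]
        if state in first_seen:
            begin = first_seen[state]
            return state, word[:begin], word[begin:position], word[position:]
        first_seen[state] = position
    return None  # no state repeats (only possible for short words)
```

With this table, `split_at_repeat("ababbaa")` yields x = ab, y = abb, z = aa, and `split_at_repeat("aaabaa")` yields x = ε, y = aa, z = abaa.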
Now, consider a language L = {ε, ab, aabb, aaabbb, ...} i.e.,
L = {a^n b^n : n = 0, 1, 2, ...}
and suppose, for the sake of argument, that it is regular.


The following are the observations made: 


a. Consider the FA (figure 12.3) with 5 states and w = a^6 b^6.
b. The first 6 letters of the word are a's. While processing these six letters, the FA visits
some state u at least twice, since there are only 5 states in the FA.


Figure 12.3. A Finite Automaton with 5 States


c. We can say that the path has a circuit, which consists of the edges that are
taken from the time u is visited for the first time to the time corresponding to the next
visit to u. Suppose this circuit has 3 edges.

d. After the first b is read, the path goes elsewhere and eventually ends up in the final state,
where w = a^6 b^6 is accepted.

e. Now consider the string w = a^(6+3) b^6. While processing this string, the FA goes around
the circuit one extra time, so we again end up in the final state and hence
w = a^(6+3) b^6 is also accepted.
f. But a^(6+3) b^6 (and, more generally, a^6 (a^3)^k b^6 for k > 0) is not in L, since it does
not have an equal number of a's and b's.
g. Thus L is not a regular language, which means that it is a non-regular language.


Definition:


A language that cannot be defined by a regular expression is called a non-regular
language. In other words, the languages which are not regular are called non-regular languages.


12.1.1 Examples of Non-Regular Languages 


a. L = {a^n b^n : n ≥ 0}

b. L = {ww^R : w ∈ Σ*}

c. L = {a^n b^l c^(n+l) : n, l ≥ 0}

d. L = {a^(n!) : n ≥ 0}

e. L = {a^i : i is prime}

f. L={O" :n>0} 

g L={Ol2e:0<i<j<k 


12.1.2 Pumping Lemma 


The fundamental tool for proving that a language is


■ not regular
■ not context-free


is known as the Pumping Lemma.


12.1.3 The Principle of Pumping Lemma 


Pumping Lemma is based on the principle of ‘Pigeonhole’, which states that if ‘n’ pigeons are
placed in ‘m’ holes, and if n > m, then at least one hole must have more than one pigeon in it.


12.2 Pumping Lemma for Regular Languages 


Pumping Lemma for regular languages is used to show that certain languages are not regular. It
gives a necessary condition for an input string to belong to a regular set and also states a
method of pumping (generating) many input strings from a given string, such that all of
them are in the language, if the language is regular.

Pumping Lemma cannot be used to establish that a given language is regular, but it can
be used to prove that a language is not regular by showing that the language does not obey
the lemma.


12.2.1 Pigeonhole Principle 


Pigeonhole Principle of Lemma for regular languages states that ‘in a transition diagram 
with n states, any string of length greater than or equal to n must repeat some state’. 


Theorem I: Pumping Lemma for RL’s


If L is a regular set, accepted by some finite automaton D (with n number of states), then
every string w in L with |w| ≥ n can be written as w = xyz, such that


a. |y| ≥ 1
b. |xy| ≤ n
c. x y^i z ∈ L ∀ i ≥ 0, where y^i denotes y repeated i times and y^0 = ε.


In other words, any sufficiently long string accepted by a finite automaton can be broken
into three parts (x, y and z) in such a way that an arbitrary number of repetitions of the
middle part (y) yields another string in L. In that case, we say that the middle substring
is pumped and hence the name, Pumping Lemma.


Proof. Let L be a regular language accepted by the DFA
D = (Q, Σ, δ, q0, F)
with n number of states. Consider an input string w of length m, with m ≥ n:
w = a1 a2 ... am, where m ≥ n.


Let

δ(q0, a1 a2 ... ai) = qi.
Then it is not possible for each of the n + 1 states q0, q1, ..., qn to be distinct, since there
are only n different states. This means that at least two of these states
must coincide.
Thus, there are two integers j and k with 0 ≤ j < k ≤ n, such that


qj = qk.
The string w can now be written as
w = a1 a2 ... aj a_{j+1} a_{j+2} ... ak a_{k+1} a_{k+2} ... am.
Thus w = xyz,
where x = a1 a2 ... aj
      y = a_{j+1} a_{j+2} ... ak
      z = a_{k+1} a_{k+2} ... am.


[Figure 12.4. Path in the Transitions for the DFA, D: x = a1 ... aj leads from q0 to qj, the middle substring y = a_{j+1} ... ak loops at qj = qk, and z = a_{k+1} ... am leads to qm.]
Since there is a path from q0 to qm that goes through qj, but not around the loop labelled
a_{j+1} ... ak, the input string
a1 a2 ... aj a_{k+1} a_{k+2} ... am is in L(D).
Consider
δ(q0, a1 a2 ... aj a_{k+1} a_{k+2} ... am) = δ(δ(q0, a1 ... aj), a_{k+1} ... am)
 = δ(qj, a_{k+1} ... am)
 = δ(qk, a_{k+1} ... am)
 = qm
⇒ x y^0 z = xz ∈ L(D).


The automaton starts from the initial state q0, and with the string x, it reaches qj. Then
with the string y, it comes back to qj again and finally with the string y^i, the automaton
will be in the same state qj,


i.e. δ(q0, x y^0) = qj


δ(q0, x y^1) = qj


δ(q0, x y^i) = qj for every i ≥ 0.



Downloaded from https://www.cambridge.org/core. Stockholm University Library, on 06 Dec 2018 at 08:04:09, subject to the Cambridge Core terms of use, available at 
https://www.cambridge.org/core/terms. https://doi.org/10.1017/UPO09788175968363.013 




By introducing the string z, the automaton reaches the final state qm, i.e.,


x y^i z ∈ L(D).


Hence proved. 


12.2.2 Illustration of Lemma for RL 


a. Consider a regular language L = (0)*, i.e. L = {ε, 0, 00, ...}, over the alphabet {0, 1},
accepted by the DFA shown in figure 12.5.


Figure 12.5. Transition Diagram for the DFA accepting L = (0)*


By choosing w = 00, we can have the splitting (one of the possible ways) as
x = 0, y = 0 and z = ε.


We see that y ≠ ε and x y^i z = 0 0^i ∈ L (i.e. y can be looped several times).
b. Consider a regular language L = (01)*, over the alphabet {0, 1}, accepted by the DFA
shown in figure 12.6.


Figure 12.6. Transition Diagram for the DFA accepting L = (01)*


By choosing w = 010101, we can have the splitting as
x = 0, y = 1010 and z = 1.
(Note that choosing y = 01, with x = ε and z = 0101, would also satisfy the conditions of the Pumping Lemma).
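
The splittings in this illustration can be checked by generating x y^i z for a few values of i and matching each result against the regular expression. The sketch below uses Python's standard `re.fullmatch`; the helper names `pumped_strings` and `all_in_language` and the bound `limit` are illustrative assumptions, and testing finitely many exponents is only a sanity check, not a proof.

```python
import re


def pumped_strings(x, y, z, limit=5):
    """Return x y^i z for i = 0, 1, ..., limit."""
    return [x + y * i + z for i in range(limit + 1)]


def all_in_language(pattern, x, y, z, limit=5):
    """True when every pumped string fully matches the regular expression."""
    return all(re.fullmatch(pattern, s) for s in pumped_strings(x, y, z, limit))
```

For example, `all_in_language(r"(01)*", "0", "1010", "1")` tests the splitting of 010101 given above, while a careless splitting such as x = 0, y = 1, z = 0101 fails already at i = 0.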




A Textbook on Automata Theory 


12.2.3 Steps in Pumping Lemma to Prove that a given 
Language is Non-Regular 


Step-1: Assume that the language L is regular.
Step-2: Consider the DFA which has ‘m’ states.
Step-3: Choose w, such that, w ∈ L with |w| ≥ m.
Step-4: Consider w = xyz such that:

|y| ≥ 1

|xy| ≤ m

and, for every such decomposition, show that x y^i z ∉ L for some i ≥ 0.
Step-5: Thus by picking i and showing that x y^i z ∉ L, we arrive at a contradiction and
conclude that the assumption in step-1 is false. Hence, the language is not regular.
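
For a fixed m, steps 3 to 5 can be tried exhaustively on a small instance. The sketch below is a hypothetical helper, not part of the lemma: it enumerates every split w = xyz with |xy| ≤ m and |y| ≥ 1 and searches a few exponents for one that leaves the language. It is shown for L = {a^n b^n : n ≥ 0}; the names `in_language`, `refuting_exponent` and the bound `max_i` are assumptions, and w is assumed to have length at least m.

```python
def in_language(s):
    """Membership in L = {a^n b^n : n >= 0}."""
    half = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * half + "b" * half


def refuting_exponent(w, m, member, max_i=3):
    """For every split w = xyz with |xy| <= m and |y| >= 1, look for an
    exponent i with x y^i z outside the language.

    Returns a dict mapping each split (x, y, z) to such an i, or None
    as soon as some split survives every tested exponent.
    """
    witnesses = {}
    for end_x in range(0, m):                  # |x| = end_x
        for end_y in range(end_x + 1, m + 1):  # |xy| = end_y <= m
            x, y, z = w[:end_x], w[end_x:end_y], w[end_y:]
            bad = next((i for i in range(max_i + 1)
                        if not member(x + y * i + z)), None)
            if bad is None:
                return None
            witnesses[(x, y, z)] = bad
    return witnesses
```

With m = 4 and w = aaaabbbb every split is refuted by i = 0, which mirrors the argument of the next example. A result of `None` only means that the tested exponents found no contradiction for some split.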


EXAMPLE 12.2.1: Show that the language L = {a^n b^n : n ≥ 0} is not regular.


Solution: Using the Pumping Lemma for L = {a^n b^n : n ≥ 0}, assume that L is a regular
language.
Let m be the integer in the Pumping Lemma. Pick a string


w = a^m b^m such that w ∈ L and |w| ≥ m.


We can write w = a^m b^m as


a^m b^m = xyz.
From the Pumping Lemma, it follows that |xy| ≤ m, |y| ≥ 1. Therefore, x and y lie
within the first m letters of w, which are all a's, and z contains the remaining a's
followed by b^m.


Thus, we have
xyz = a^m b^m, y = a^k, k ≥ 1.


From the lemma:
x y^i z ∈ L, ∀ i = 0, 1, 2, ...


x y^i z = a^(m−k) (a^k)^i b^m.


Choose i = 0
⇒ x y^0 z = a^(m−k) b^m ∈ L.


This is a contradiction, since a^(m−k) b^m has fewer a's than b's and so is not in L.
Thus, the assumption that L is a regular language is not true. Hence, L is not regular.


EXAMPLE 12.2.2: Show that the language L = {ww^R : w ∈ Σ*} is not regular.


Solution: Using the Pumping Lemma for L = {ww^R : w ∈ Σ*}, assume that L is a regular
language.
Let m be the integer in the Pumping Lemma. Pick a string


w = a^m b^m b^m a^m, such that, w ∈ L and |w| ≥ m.
We can write w = a^m b^m b^m a^m as
a^m b^m b^m a^m = xyz.


From the Pumping Lemma, it follows that |xy| ≤ m, |y| ≥ 1.


Therefore, x and y lie within the first m letters of w, which are all a's, and
y = a^k and k ≥ 1.


Thus, we have xyz = a^m b^m b^m a^m, y = a^k, k ≥ 1.
From the lemma: x y^i z ∈ L, ∀ i = 0, 1, 2, ...


Choosing i = 0
x y^0 z = a^(m−k) b^m b^m a^m
and a^(m−k) b^m b^m a^m ∉ L.
This is a contradiction.
Thus, the assumption that L is a regular language is not true. Hence, L is not regular.
EXAMPLE 12.2.3: Show that the language L = {a^n b^l c^(n+l) : n, l ≥ 0} is not regular.


Solution: Using the Lemma for L = {a^n b^l c^(n+l) : n, l ≥ 0}, assume that L is a regular
language.
Let m be the integer in the Pumping Lemma. Pick a string


w = a^m b^m c^(2m) such that w ∈ L and |w| ≥ m.
We can write w = a^m b^m c^(2m) as


a^m b^m c^(2m) = xyz.
From the Pumping Lemma, it follows that |xy| ≤ m, |y| ≥ 1.
Therefore, x and y lie within the first m letters of w, which are all a's, and
y = a^k and k ≥ 1.
Thus, we have xyz = a^m b^m c^(2m), y = a^k, k ≥ 1.
From the lemma: x y^i z ∈ L ∀ i = 0, 1, 2, ...
Choose i = 0.


⇒ x y^0 z = xz
x y^0 z = a^(m−k) b^m c^(2m)
and a^(m−k) b^m c^(2m) ∉ L, since (m − k) + m ≠ 2m.
This is a contradiction.
Thus, the assumption that L is a regular language is not true. Hence, L is not regular.


EXAMPLE 12.2.4: Show that the language L = {a^(n!) : n ≥ 0} is not regular.
Solution: Using the Lemma for L = {a^(n!) : n ≥ 0}, assume that L is a regular language,
where n! = 1·2·...·(n−1)·n.
Let m be the integer in the Pumping Lemma; since the lemma also holds for any larger
integer, we may take m > 1. Pick a string
w = a^(m!) such that w ∈ L and |w| ≥ m.


We can write w = a^(m!) as
a^(m!) = xyz.


From the Pumping Lemma, it follows that |xy| ≤ m, and |y| ≥ 1.


Therefore, x and y lie within the first m letters of w, the remaining m! − m letters
belong to z, and
y = a^k and 1 ≤ k ≤ m.
Thus, we have xyz = a^(m!), y = a^k, 1 ≤ k ≤ m.
From the Pumping Lemma: x y^i z ∈ L, ∀ i = 0, 1, 2, ...
Choose i = 2.
⇒ x y^2 z ∈ L
⇒ x y^2 z = xyyz
⇒ x y^2 z = a^(m!+k) for 1 ≤ k ≤ m.
Since L = {a^(n!) : n ≥ 0}, there must exist a number p such that
m! + k = p! for 1 ≤ k ≤ m.
However, for m > 1,


m! < m! + k ≤ m! + m ≤ m! + m! < m!·m + m! = m!(m + 1)
= (m + 1)!

⇒ m! < m! + k < (m + 1)!

⇒ m! + k ≠ p! for any p.


Therefore, a^(m!+k) ∈ L by the lemma, but a^(m!+k) ∉ L for 1 ≤ k ≤ m by the definition of L.
This is a contradiction.
Thus, the assumption that L is a regular language is not true. Hence L is not regular.
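
The key inequality of this example, m! < m! + k < (m + 1)! for 1 ≤ k ≤ m, can be confirmed numerically for small m. The following sketch uses `math.factorial` from the standard library; the function name `strictly_between_factorials` is illustrative.

```python
from math import factorial


def strictly_between_factorials(m):
    """Check m! < m! + k < (m + 1)! for every k with 1 <= k <= m."""
    low, high = factorial(m), factorial(m + 1)
    return all(low < low + k < high for k in range(1, m + 1))
```

The check holds for every m > 1 that is tried, and it fails for m = 1 (where 1! + 1 = 2!), which is why the argument requires m > 1.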


EXAMPLE 12.2.5: Show that L = {a^i : i is prime} is not regular.


Solution: Using the lemma for L = {a^i : i is prime}, assume that L is a regular language.
Let m be the integer in the Pumping Lemma. Pick a prime k ≥ m and the string w = a^k, so that


w ∈ L and |w| ≥ m.
From the Pumping Lemma, we write w = a^k as
a^k = xyz where |xy| ≤ m and |y| ≥ 1.


We have x y^i z ∈ L, ∀ i = 0, 1, 2, ...
Accordingly, the length of x y^(k+1) z must be a prime number, as is the case for each string
of L.


But Length(x y^(k+1) z) = Length(x y y^k z)
= Length(xyz) + Length(y^k)
= k + k(Length(y))
= k(1 + Length(y))
(which is not a prime, since k ≥ 2 and 1 + Length(y) ≥ 2).
⇒ x y^(k+1) z ∉ L


This is a contradiction.
Thus, the assumption that L is regular is not true. Hence, L is not regular.
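
The length computation above can be spot-checked: for a string of prime length p whose pumped part has length |y|, pumping y a total of p + 1 times gives length p(1 + |y|), which is never prime. The sketch below assumes this reading of the argument; `is_prime` and `pumped_length` are illustrative helper names.

```python
def is_prime(n):
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def pumped_length(p, y_len):
    """Length of x y^(p+1) z when |xyz| = p and |y| = y_len."""
    return p + p * y_len
```

For instance, with p = 5 and |y| = 2 the pumped string has length 15, which is composite.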
EXAMPLE 12.2.6: Show that L = {w | n_a(w) = n_b(w)} is not regular.


Solution: Using the Pumping Lemma for L = {w | n_a(w) = n_b(w)}, assume that L is a regular
language.
Let m be the integer in the Pumping Lemma. Pick a string w with
w = a^m b^m such that w ∈ L and |w| = 2m ≥ m.


We can write w = a^m b^m as


a^m b^m = xyz.
From the Pumping Lemma, it follows that |xy| ≤ m, |y| ≥ 1.
Therefore, x and y lie within the first m letters of w, which are all a's, and


y = a^k and k ≥ 1.


Thus, we have xyz = a^m b^m, y = a^k, k ≥ 1.
From the lemma: x y^i z ∈ L ∀ i = 0, 1, 2, ...
x y^i z = a^(m−k) (a^k)^i b^m.
Choose i = 0 ⇒ the number of a's is less than the number of b's.
i = 2 ⇒ the number of a's is more than the number of b's.


In either case x y^i z ∉ L. This is a contradiction.
Thus, the assumption that L is a regular language is not true and hence, L is non-regular.


EXAMPLE 12.2.7: Show that L = {ww | w ∈ {a, b}*} is not regular.
Solution: Using the Pumping Lemma for L = {ww | w ∈ {a, b}*}, assume that L is a regular
language.
Let m be the integer in the Pumping Lemma. Pick a string w, with
w = a^m b a^m b such that w ∈ L and |w| = 2m + 2 ≥ m.
We can write w = a^m b a^m b as
a^m b a^m b = xyz.
From the Pumping Lemma, it follows that |xy| ≤ m, |y| ≥ 1. Therefore, x and y lie
within the first m letters of w, which are all a's, and


y = a^k and k ≥ 1.
Thus, we have xyz = a^m b a^m b, y = a^k, k ≥ 1. From the lemma: x y^i z ∈ L, ∀ i =
0, 1, 2, ...
x y^i z = a^(m−k) (a^k)^i b a^m b.


Choose i = 0,
⇒ x y^0 z = a^(m−k) b a^m b ∉ L.


This is a contradiction. Thus, the assumption that L is regular is not true. Hence, L is not
regular.


12.3 Pumping Lemma for Context-Free Languages 


The Pumping Lemma for CFLs is useful for


a. generating an infinite number of strings from a given sufficiently long string, and
b. proving that certain languages are not context-free.


Theorem II: Pumping Lemma for CFL’s.
Let L be any CFL, then there exists an integer m such that, for any string w ∈ L with
|w| ≥ m, the string w can be split into five parts as w = uvxyz such that


a. |vy| ≥ 1
b. |vxy| ≤ m
c. for all i ≥ 0, u v^i x y^i z ∈ L.


Proof. Consider a context-free grammar G that generates an infinite language and has no
unit productions and no ε-productions. Consider a string w ∈ L(G) with |w| ≥ m, where


m > (number of productions) × (largest right-side of productions).


This suggests that some variable, say A, must be repeated in the derivation of w.
Consider u, v, x, y, z: strings of terminals,


with S ⇒* uAz
A ⇒* vAy
A ⇒* x


and w = uvxyz.
Now, the tree of w is:


[Figure 12.7 The Tree of w: the variable A is repeated on one path below S, the lower occurrence being the last repeated variable; the yield is divided into u, v, x, y and z.]


The possible derivations of w are represented below:


[Figure 12.8 Derivations of w]
Following are the strings that can be generated:
a. We know that S ⇒* uAz, A ⇒* vAy and A ⇒* x.
S ⇒* uAz ⇒* uxz
∴ w = u v^0 x y^0 z.


b. We know that S ⇒* uAz, A ⇒* vAy and A ⇒* x.


S ⇒* uAz ⇒* uvAyz ⇒* uvxyz
∴ w = u v^1 x y^1 z.


c. We know that S ⇒* uAz, A ⇒* vAy and A ⇒* x.
S ⇒* uAz ⇒* uvAyz ⇒* uvvAyyz ⇒* uvvxyyz


∴ w = u v^2 x y^2 z.
d. We know that S ⇒* uAz, A ⇒* vAy and A ⇒* x.
S ⇒* uAz ⇒* uvAyz ⇒* uvvAyyz ⇒* uvvvAyyyz ⇒* uvvvxyyyz
∴ w = u v^3 x y^3 z.
In general, for derivations S ⇒* uAz, A ⇒* vAy and A ⇒* x, the string generated is
S ⇒* uAz ⇒* uvAyz ⇒* uvvAyyz
⇒* uvvvAyyyz ⇒* ...
⇒* uvv...vAy...yyz
⇒* uv...vxy...yz
= u v^i x y^i z.


Therefore, a string of the form u v^i x y^i z for i ≥ 0 is generated by G. Further, uvxyz ∈ L(G)
implies that u v^i x y^i z ∈ L(G), with |vxy| ≤ m and |vy| ≥ 1.


Figure 12.9a. |vxy| ≤ m, Since A is the Last Repeated Variable


Figure 12.9b. |vy| ≥ 1, Since there are No Unit and ε-Productions
12.3.1 Illustration of the Lemma for CFL


Consider an infinite CFL with productions, given as


S → AB
A → aBb
B → Sb
B → b.


Consider a string w = abbabbbb. Now, the derivation of w is,


S ⇒ AB ⇒ aBbB ⇒ abbB ⇒ abbSb
⇒ abbABb
⇒ abbaBbBb
⇒ abbabbBb
⇒ abbabbbb.


The derivation tree of w is:


[Figure 12.10. Main Tree, in which the variable B is repeated]
Consider the subtree of the node B:


B ⇒ Sb ⇒ ABb ⇒ aBbBb ⇒ aBbbb.
[Figure 12.11. Subtree-1.0]


Further the derivation of this subtree is as follows:


Subtree-1.0: B ⇒ ... ⇒ aBbbb.


[Figure 12.12. Subtree 1.1]

The main tree gets extended as:


[Figure 12.13. The Extension of the Main Tree, containing Subtree 1.0 and Subtree 1.1]


Further extension of the main tree is as follows:


[Figure 12.14. Extension of the Main Tree]
Thus S ⇒* abbaa B bbbbbb ⇒ abbaa b bbbbbb.
Therefore, abbaabbbbbbb is also generated by the grammar.
Following are some of the strings that can be generated:
We know that B ⇒ b, B ⇒* aBbbb, S ⇒* abbaBbbb.


a. S ⇒* abba B bbb ⇒ abbabbbb. Hence, w = abbabbbb.


b. S ⇒* abb a[B]bbb ⇒* abb a a[B]bbb bbb ⇒ abbaabbbbbbb.
c. S ⇒* abba B bbb
⇒* abba (a) B (bbb) bbb
⇒* abba (a)(a) B (bbb)(bbb) bbb
⇒ abba(a)^2 b(bbb)^2 bbb.
In general, the string generated is:
S ⇒* abbaBbbb
⇒* abba(a)^i B(bbb)^i bbb
⇒ abba(a)^i b(bbb)^i bbb.


Thus, abba(a)^i b(bbb)^i bbb is generated by G for every i ≥ 0, and this string has the
five-piece form u v^i x y^i z with u = abba, v = a, x = b, y = bbb and z = bbb.
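
Whether a particular string is generated by the grammar of this section can be checked with a small recursive recogniser. The sketch below is only an illustration and assumes what holds for this grammar: single-character symbols, no ε-productions and no unit productions, so every recursive call works on a strictly shorter span. The names `GRAMMAR` and `derives` are hypothetical.

```python
from functools import lru_cache

# Productions of Section 12.3.1; upper-case letters are variables.
GRAMMAR = {
    "S": ["AB"],
    "A": ["aBb"],
    "B": ["Sb", "b"],
}


def derives(word, start="S"):
    """Decide whether `start` derives `word`.

    Assumes no ε-productions and no unit productions, so every symbol
    of a right-hand side covers at least one letter and the recursion
    always works on strictly shorter spans.
    """

    @lru_cache(maxsize=None)
    def symbol(sym, i, j):
        # Can `sym` derive word[i:j]?
        if sym not in GRAMMAR:
            return j == i + 1 and word[i] == sym
        return any(body(rhs, i, j) for rhs in GRAMMAR[sym])

    @lru_cache(maxsize=None)
    def body(rhs, i, j):
        # Can the symbol sequence `rhs` derive word[i:j]?
        if len(rhs) == 1:
            return symbol(rhs, i, j)
        # Leave at least one letter for each remaining symbol.
        return any(symbol(rhs[0], i, k) and body(rhs[1:], k, j)
                   for k in range(i + 1, j - len(rhs) + 2))

    return len(word) > 0 and symbol(start, 0, len(word))
```

Calling `derives` on abba(a)^i b(bbb)^i bbb for a few values of i gives a quick sanity check of the pumped strings, although it is no substitute for the derivations shown above.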


12.4 Identification of u,v,x,y and z from the Derivation Tree 


The five parts of the string u, v, x, y and z, from the derivation tree T, are given below:


a. x is the yield of the subtree T″ whose root is the lower node labelled A.
b. v is the part of the yield of the tree T′, rooted at the upper node labelled A, up to the
point where the yield of T″ starts.
c. y is the right part of the yield of the tree T′.
d. u is the yield of T up to the point where the yield of T′ starts.
e. z is the remaining part of the yield of T.


[Figure 12.15. Derivation Tree T Representing the Five Parts of the String]


EXAMPLE 12.4.1: Consider the following productions: 
S— PQ 


Derivation tree for the string abab 


Figure 12.16. The Derivation Tree for the String abab 
Split of the tree into u, v, x, y and z


Figure 12.17. Split of the Derivation Tree for the String abab
x = b, v = ba, u = a and y = z = ε


12.4.1 Non-context Free Languages 


Languages that are not context-free are called non-context free languages. Following are 
some of the non-context free languages. 


L= {a :n> 1} 

L = {a^p : p is prime}

L = {a^n b^n a^n : n ≥ 1}

L = {a^n b^m a^n b^m | n, m ≥ 1}

L = {ss : s ∈ {a, b}*}

ae on : aie 1} 
={a' ine 

L = {a^(n^2) b^n : n ≥ 0}.


12.4.2 Step-by-Step Procedure to Prove that a Language is 
Non-context Free 


Step 1: Assume that the given language is context-free, so that the pumping lemma
applies.


Step 2: There exists an integer ‘m’ such that the lemma holds for any string w ∈ L with
|w| ≥ m. Pick such a string w.
Step 3: Split the string w into uvxyz, with |vxy| ≤ m and |vy| ≥ 1.
Step 4: For every such split, pick ‘i’ so that u v^i x y^i z ∉ L.


Step 5: Thus, the assumption that the given language is context-free is not true and
hence the language is non-context free.

EXAMPLE 12.4.2: Show that the language L = {a^n b^n c^n : n ≥ 0} is not context-free.
Solution: Assume that L is context-free and thus the pumping lemma applies. Let m be an
integer in the pumping lemma. Pick a string w with
w = a^m b^m c^m such that |w| ≥ m.
We can write w = a^m b^m c^m as
a^m b^m c^m = uvxyz.
From the lemma, it follows that |vxy| ≤ m and |vy| ≥ 1.


We examine all possible locations of the string vxy in w, for which the following cases
are considered. Since |vxy| ≤ m, the string vxy cannot overlap all three of a^m, b^m and c^m.


Case-1: vxy is within a^m.
Here vxy consists of only a's, say v = a^k1 and y = a^k2 with k = k1 + k2 ≥ 1.
Repeating v and y once more gives
u v^2 x y^2 z = a^(m+k) b^m c^m, k ≥ 1.
From the lemma, u v^2 x y^2 z ∈ L.
But u v^2 x y^2 z = a^(m+k) b^m c^m ∉ L.
This is a contradiction.
Case-2: vxy is within b^m.
The analysis is similar to case 1.
Thus u v^2 x y^2 z = a^m b^(m+k) c^m ∉ L.
This is a contradiction.


Case-3: vxy is within c^m.

The analysis is similar to case 1.

Thus u v^2 x y^2 z = a^m b^m c^(m+k) ∉ L.

This is a contradiction.

Case-4: vxy overlaps a^m and b^m.


The different possibilities are:
Possibility-1: v contains only a, y contains only b.
Let v = a^k1 and y = b^k2. Now (k1 + k2 ≥ 1):
u v^2 x y^2 z = a^(m+k1) b^(m+k2) c^m.


From the pumping lemma u v^2 x y^2 z ∈ L.
However a^(m+k1) b^(m+k2) c^m ∉ L.
This is a contradiction.


Possibility-2: v contains a and b, y contains only b.
Let v = a^k2 b^k1 and y = b^k. Now (k1 + k2 + k ≥ 1):
u v^2 x y^2 z = a^m b^k1 a^k2 b^(m+k) c^m.
From the pumping lemma u v^2 x y^2 z ∈ L.
But a^m b^k1 a^k2 b^(m+k) c^m ∉ L.
This is a contradiction.
Possibility-3: v contains only a, y contains a, b.
The analysis is similar to that for possibility 2, yielding a contradiction.


Case-5: vxy overlaps b^m and c^m.


Three possibilities arise:
Possibility-1: v contains only b, y contains only c.
Possibility-2: v contains b and c, y contains only c.
Possibility-3: v contains only b, y contains b, c.
This results in a contradiction again.


Thus, in all the cases, we arrive at a contradiction. Hence, our assumption that L is
context-free is not true. Therefore, L is non-context free.
EXAMPLE 12.4.3: Show that the language L = {a^(n!) | n ≥ 0} is not context-free.


Solution: Assume that L is context-free and thus the Pumping Lemma applies. Let m be
an integer in the Pumping Lemma. Pick a string w with


w = a^(m!) such that |w| ≥ m.
We can write w = a^(m!) as


a^(m!) = uvxyz.


From the lemma, it follows that |vxy| ≤ m and |vy| ≥ 1.
To examine all possible locations of the string vxy in w, we proceed as follows:


v = a^k1, y = a^k2, 1 ≤ k1 + k2 ≤ m.


With k = k1 + k2, consider a^(m!+k) = u v^2 x y^2 z.


Since 1 ≤ k ≤ m for m ≥ 2, we have
m! + k ≤ m! + m
< m! + m!·m
= m!(1 + m)
= (m + 1)!
⇒ m! < m! + k < (m + 1)!


⇒ a^(m!+k) = u v^2 x y^2 z ∉ L.


This is a contradiction. So, the language L = {a^(n!) | n ≥ 0} is not context-free.


EXAMPLE 12.4.4: Show that L = {a^(n^2) b^n | n ≥ 0} is not context-free.


Solution: Assume that L is context-free and thus the Pumping Lemma applies. Let m be
an integer in the Pumping Lemma. Pick a string w with


w = a^(m^2) b^m such that |w| ≥ m.


We can write w = a^(m^2) b^m as


a^(m^2) b^m = uvxyz.


From the Lemma, it follows that |vxy| ≤ m and |vy| ≥ 1. We examine all possible
locations of the string vxy in a^(m^2) b^m, by considering the most complicated case as follows:


Case: v is in a^(m^2) and y is in b^m.
v = a^k1, y = b^k2, 1 ≤ k1 + k2 ≤ m and k1 ≠ 0, k2 ≠ 0.


Consider a^(m^2 − k1) b^(m − k2) = u v^0 x y^0 z.
Since k1 ≠ 0, k2 ≠ 0 and 1 ≤ k1 + k2 ≤ m, we have k1 ≤ m − 1 < 2m − 1, and so
(m − k2)^2 ≤ (m − 1)^2

= m^2 − 2m + 1

< m^2 − k1

⇒ m^2 − k1 ≠ (m − k2)^2


⇒ a^(m^2 − k1) b^(m − k2) = u v^0 x y^0 z ∉ L.


This is a contradiction. So, the language L = {a^(n^2) b^n | n ≥ 0} is not context-free.


12.5 Exercises 


1. What is Pumping Lemma? Explain why it is used. 

2. State the pigeonhole principle of the lemma. 

3. Define non-regular languages with examples. 

4. State and prove the pumping lemma for regular languages. 
5. Give the general procedure used in pumping lemma, for proving that certain languages
are not regular.
6. Show that L = {w | n_a(w) < n_b(w)} is not regular.
7. Show that L = {ww | w ∈ {a, b}*} is not regular.
8. Show that L = {(ab)^n a^k | n > k, k ≥ 0} is not regular.
9. Show that L = {a*|k > 0} is not regular. 
10. Show that L = {o7|i > 0} is not regular. 
11. Specify a language L ⊆ {0, 1}*, such that, L* is not regular.
12. State and prove the pumping lemma for context-free languages.
13. Show that L = {ww | w ∈ {a, b}*} is not context-free.
14. Show that L = {a"b"a"b™|n,m > 1} is not context-free. 
15. Show that L = {a^{n^2} | n ≥ 1} is not context-free.
16. L = {a"b"c>|m > 5} is a context-free language. Justify the answer.


Turing Machine


‘Life needs positive extension which makes the system achieve higher transitions, 
understand the language of creation, and move towards excellence.’ 


Introduction 


In the previous chapters, we discussed two of the major approaches to the modelling of 
computation viz. the automata approach and the grammatical approach. Under automata 
approach, we discussed finite automata and pushdown automata while under the grammatical 
approach, we discussed context-free languages. 

Further, we made the following observations: 


a. A finite automaton computational model is computationally equivalent to a regular 
language model. 


b. A pushdown automaton model is computationally equivalent to a context-free 
language model. 


c. A pushdown automaton model is more powerful than a finite automaton model, in
the sense that every language accepted by a finite automaton is also recognised by
a pushdown automaton. However, there are languages, e.g. the language
{a^n b^n : n ∈ N}, which are recognised by pushdown automata but not by
finite automata.


d. There are languages, including the language {a^n b^n c^n : n ∈ N}, which are not accepted
even by pushdown automata.


This prompts us to introduce a more powerful automaton model, which recognises more
languages than a pushdown automaton model, called the Turing machine. Turing machines
were first proposed by Alan Turing in 1936 and were designed to meet the following objectives:


a. They should be automata, i.e., their construction and function should be in the same 
general spirit as the other computational models. 


b. They should be as simple as possible, to describe, to define formally and to reason 
about. 


c. They should be as general as possible, in terms of the computations they can carry
out.


Figure 13. The Language Hierarchy (Regular Languages, e.g. a*, a*b*, inside Context-Free
Languages, inside the Languages accepted by Turing Machines, e.g. a^n b^n c^n, ww)


Definition:
The following equivalent descriptions of a Turing machine (TM) are in use:


a. Turing machines are simple abstract computational devices, intended to help
investigate the extent and limitations of what can be computed.

b. A Turing machine is a kind of state machine. At any time, the machine is in one of
a finite number of states. Instructions for a Turing machine include the specification
of the conditions under which the machine will make transitions from one state to
another.


Alan Mathison Turing (1912-1954) was a British mathematician and cryptographer.
He went to King's College, Cambridge in 1931 to study Mathematics. Turing
graduated from Cambridge in Mathematics in 1934 and was a fellow at King's for two
years, during which period he wrote his now famous paper, published in 1937, On
Computable Numbers, with an Application to the Entscheidungsproblem.

Turing is considered to be one of the fathers of modern computer science. He
provided an influential formalisation of the concept of algorithm and computation:
the Turing machine. He formulated the now widely accepted 'Turing' version of the
Church-Turing thesis, that is, any practical computing model has either the equivalent
or a subset of the capabilities of a Turing machine. During World War II, he was
the director of the Naval Enigma Hut at Bletchley Park for some time and remained
the chief cryptanalyst of the Naval Enigma effort throughout the war. After the
war, he designed one of the earliest electronic programmable digital computers at
the National Physical Laboratory and, shortly thereafter, actually built another early
machine at the University of Manchester. He also, amongst many other things, made
significant and characteristically provocative contributions to the discussion "Can
machines think?".


13.1 Components of a Turing Machine


A Turing machine is usually described as consisting of the following three components. 


■ tape
■ head
■ control unit.


Figure 13.1. Components of a Turing Machine (the tape, the read/write head with movement
in both directions, and the control unit)


a. TAPE 


A tape is divided into a sequence of numbered cells or squares, one next to the other.
Each cell contains a symbol from some finite alphabet. The alphabet contains a blank
symbol (B) and one or more other symbols. The set of symbols of the tape is denoted
by Γ. The tape is assumed to be arbitrarily extensible to the left as well as to the
right. This implies that the Turing machine is always supplied with as much tape as
it needs for its computation. Cells that have not been written before are assumed to
be filled with the blank symbol.


Figure 13.2. The Tape with Cell i Indicating the Start Cell for the Current Computation
(cells i, i+1, i+2, ..., i+n; unwritten cells hold the blank symbol; the read-write head is at cell i)


b. HEAD

■ A tape head is always stationed at one of the tape cells and provides
communication for the interaction between the tape and the control unit.

■ In a single step, a tape head reads the contents of a cell on the tape (reads a
symbol), replaces it with some other character (writes a symbol) and repositions
itself to the next cell to the right or to the left of the one it has just read, or does
not move (moves left or right or does not move).

This course of action is called the move of a Turing machine.

■ At the beginning of the processing, the tape head always begins by reading the
input in cell i. The head can never move left from the cell i and if it is given an
order to do so, the machine crashes.


Figure 13.3. Head (at each time step the read-write head 1. reads a symbol, 2. writes a symbol,
3. moves left or right or does not move)


EXAMPLE 13.1.1:


Figure 13.4. Tape at Time-0 (the head starts at the leftmost position of the input string)


Figure 13.5. Tape at Time-1 After the Action: 1. Reads a 2. Writes K 3. Moves Right


Figure 13.6. Tape at Time-2 After the Action: 1. Reads b 2. Writes R 3. Moves Right


c. CONTROL UNIT


The reading from the tape or writing into the tape is determined by the control unit. It
contains a finite set of states Q. The states are categorised into three, viz.,


(i) The Initial State: It is the state of control, just at the time when the TM starts its operations.
The initial state is denoted by q0 and q0 ∈ Q.

(ii) The Halt State: This is the state in which the TM stops all further operations. The halt
state is distinct from the initial state, i.e., for a TM the halt and the initial states cannot
be the same. The halt state is denoted by h and h ∈ Q. There can be one or more halt
states in a TM.

(iii) Other states.


13.1.1 Tape and Head of a FA/PDA Vs. Tape and Head of a TM


The following are the differences between the roles of the tape and the tape head of a FA/PDA
and those of the tape and head of a TM:


a. The cells of the tape of a FA or a PDA are only read/scanned but are never
changed/written into, whereas the cells of the tape of a TM may also be written.

b. The tape head of a FA or a PDA always moves from left to right. However, the tape
head of a TM can move in both directions.


From the above two differences, it is clear that for a FA or a PDA, the information
in the tape cells which have already been scanned does not play any role in deciding the future
moves of the automaton. On the other hand, in the case of a TM, the information contents
of all the cells (including the ones scanned earlier) play a role in deciding the future
moves.


13.1.2 Halt State of a TM Vs. Set of Final States of a FA/PDA


■ A TM, on entering the halt state, stops making moves and whatever string is on
the tape is taken as the output, irrespective of whether the head is at
the end or in the middle of the string on the tape.


■ If a FA/PDA enters a final state while scanning a symbol of the input tape, it can
still go ahead with the repeated activities of moving to the right, scanning the symbol
under the head, entering a new state etc. Further, the portion of a string from the left up to
the symbol under the tape head is accepted, if the state is a final state, and is rejected
if it is not.


13.2 Description of a Turing Machine 


There are different ways to describe the task of a Turing machine: 


13.2.1 The Transition Diagram 


The Turing machine can be represented using a transition diagram, which is a directed graph.
The label of an arc going from the vertex that corresponds to the state p to the vertex that
corresponds to the state q can be written in different forms as follows:


Form-1:


Read symbol (a) → Write symbol (b), move Left (L) or
move Right (R) or No move (N)


( ) --- a → b, L ---> ( )


Figure 13.7. Edge Label Format


Form-2:


Read symbol (a) / Write symbol (b), move Left (L) or move Right (R)
or do not move (N)


( ) --- a/b, R ---> ( )


Figure 13.8. Edge Label Format


EXAMPLE 13.2.1: Consider the initial configuration of the tape, as shown in figure 13.9.


Figure 13.9. Tape at time-1 and tape at time-2 (the current state is marked below the head)


The transition diagrams for this situation, by using the different edge label forms, are:


Figure 13.10. Transition Diagram Using the Different Edge Label Forms (Form 1 and Form 2)


13.2.2 5-Tuple Specification 


The action performed by a TM in going from one state to another can be specified by
the 5-tuple:


(State-1, Read Symbol, Write Symbol, L/R/N, State-2)
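
One way to hold such 5-tuples in a program is sketched below in Python. The class name Move, the helper to_delta and the sample move (read a, write K, move right, as in Figure 13.5, with hypothetical state names q1 and q2) are our own illustrative choices and not part of the formalism.

```python
from typing import NamedTuple

class Move(NamedTuple):
    """One 5-tuple: (State-1, Read Symbol, Write Symbol, L/R/N, State-2)."""
    state: str
    read: str
    write: str
    direction: str      # "L", "R" or "N"
    next_state: str

def to_delta(moves):
    """Index the 5-tuples by (state, read symbol) for fast lookup."""
    delta = {}
    for mv in moves:
        key = (mv.state, mv.read)
        if key in delta:
            raise ValueError(f"two moves defined for {key}")
        delta[key] = (mv.next_state, mv.write, mv.direction)
    return delta

# Hypothetical move: in q1 reading a, write K, move right, enter q2.
sample = Move("q1", "a", "K", "R", "q2")
delta = to_delta([sample])
```

Keying the dictionary on (state, read symbol) mirrors the fact that a deterministic machine has at most one move for each such pair.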
EXAMPLE: The 5-tuple specification for the action performed by the TM in figure 13.9 is:


<q1, a, K, R, q2>.


13.2.3 Transition Table 


The operation of a TM, for a given set of symbols on the tape, can be
represented in a tabular format, called the transition table or action table. In other words,
for a given state and the symbol it currently reads, the transition table describes how to:


a. write a symbol
b. move the head (left one step (L) or right one step (R) or no move (N))
c. assume the same or a new state, as prescribed.


The transition tables can be represented in different forms as shown:

Form-1:

Current State | Read Symbol | Write Symbol | Move Tape | Final State | 5-tuple Specification

Table 13.1 Action table form-1


Form-2:

Current State |                 q1                    | ... |                 qn
Tape Symbol   | Write Symbol | Move Tape | Next State | ... | Write Symbol | Move Tape | Next State

Table 13.2 Action table form-2


Form-3:

       |          Tape Symbol
States | a1       | a2 | ... | an
       | <action> |    |     |

where <action> = (next state, write symbol, move).

Table 13.3 Action table form-3


EXAMPLE 13.2.2: Consider a TM, whose task is described in the transition diagram shown
in figure 13.11:


Figure 13.11. Transition Diagram


The transition table, in different forms, for the above TM is given below:

Current State | Read Symbol | Write Symbol | Move Tape | Final State | 5-tuple Specification
q0            | B           | B            | R         | q2          | (q0, B, B, R, q2)
q1            | 0           | 0            | R         | q2          | (q1, 0, 0, R, q2)
q1            | 1           | B            | R         | q1          | (q1, 1, B, R, q1)
q1            | B           | B            | N         | h           | (q1, B, B, N, h)
q2            | 0           | 0            | L         | q2          | (q2, 0, 0, L, q2)
q2            | 1           | 1            | R         | q1          | (q2, 1, 1, R, q1)

Table 13.4 Transition table in form-1


       |                Tape Symbols
States | 0           | 1           | B
q0     | -           | -           | <q2, B, R>
q1     | <q2, 0, R>  | <q1, B, R>  | <h, B, N>
q2     | <q2, 0, L>  | <q1, 1, R>  | <h, B, N>

Table 13.5 Transition table in form-3


Current State |         q0          |         q1          |         q2
Tape Symbol   | Write | Move | Next | Write | Move | Next | Write | Move | Next
0             | -     | -    | -    | 0     | R    | q2   | 0     | L    | q2
1             | -     | -    | -    | B     | R    | q1   | 1     | R    | q1
B             | B     | R    | q2   | B     | N    | h    | B     | N    | h

Table 13.6 Transition table in form-2


13.3 Observations on TM 


a. No ε-transitions are allowed in a TM.


Figure 13.12. Transition Diagram, ε → b, L is not Allowed


b. A Turing machine halts if there are no possible transitions to follow.

c. In case the TM halts in a halt state, we say that the word on the input tape is accepted by the TM.


Figure 13.14. A TM that accepts the word w = aaa


d. In a TM, the halt states have no outgoing transitions.


Figure 13.15. Transition Diagram Showing No Outgoing Transition for Halt State


e. Infinite loop in a TM (hanging in some states).

Because of the infinite loop:
(i) the final state cannot be reached.
(ii) the machine never halts.
(iii) the input is not accepted.


EXAMPLE 13.3.1: Consider a TM, whose action is described in figure 13.16:


Figure 13.16. A TM at time 0 (edge labels b → b, L and a → a, R)


Figure 13.17. Infinite Loop (tape snapshots at Time 1 to Time 5; the same moves then repeat forever)


13.4 Elements of TM 


A TM has the following seven characteristics:


a. A finite set Q of states, q0, q1, ..., qn.

b. A finite input alphabet of letters, Σ = {a, b, ...}.

c. A finite alphabet Γ of tape characters. The tape alphabet does not contain blank B,
although a TM can write B onto its tape which is called erasing.

d. A transition function δ, which tells how the machine goes from one step to the next,
i.e., δ describes the following to be performed for a given character scanned at the
current state:

■ what character is to be written on the tape and,
■ the tape head movement (Left, Right, No move) or (L, R, N).

e. An initial state q0.

f. A special symbol B indicating the blank character. Σ does not include B.

g. A set of halt states 'h'.


13.4.1 Ordered Seven-Tuple Specification of a TM


Formally, a Turing machine M is represented as a 7-tuple


M = (Q, Σ, Γ, δ, q0, B, h)


■ Q is the finite set of states
■ Σ is the finite set of non-blank symbols
■ Γ is the set of tape characters
■ q0 ∈ Q is the initial state
■ B is the blank character
■ h ∈ Q is the final state
■ δ is the transition function of a Turing machine and is defined as a mapping from Q × Γ to
Q × Γ × {L, R, N}.


13.4.2 Transitions of a TM


The transition of a Turing machine is represented as


δ(q_i, a_k) = (q_j, a_l, x)


for q_i, q_j ∈ Q, a_k, a_l ∈ Γ and x is any one of the values 'L', 'R' and 'N'.

The meaning of δ(q_i, a_k) = (q_j, a_l, x) is that, if q_i is the current state of the TM and a_k
is the symbol in the cell currently under the head, then the TM writes a_l in the cell currently
under the head, enters the state q_j and the head moves to the adjacent cell to the right, if the
value of x is R. Otherwise, the head moves to the adjacent cell to the left, if the value of x is L,
and continues scanning the same cell, if the value of x is N.
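
The effect of a single transition can be sketched in Python as follows. This is a minimal illustration under our own assumptions: the tape is a dictionary from cell numbers to symbols, missing cells count as blank, and the names step, BLANK and OFFSET are not from the text.

```python
BLANK = "B"
OFFSET = {"L": -1, "R": 1, "N": 0}

def step(delta, state, tape, head):
    """Apply one move of the machine.

    delta maps (state, scanned symbol) to (next state, written symbol, x).
    Returns the new (state, head) pair, or None when no move is defined.
    The tape dictionary is updated in place.
    """
    scanned = tape.get(head, BLANK)
    if (state, scanned) not in delta:
        return None
    next_state, written, x = delta[(state, scanned)]
    tape[head] = written            # write a_l in the cell under the head
    return next_state, head + OFFSET[x]

# One move with the single transition (q0, B) -> (q2, B, R).
tape = {0: "B", 1: "1", 2: "0"}
result = step({("q0", "B"): ("q2", "B", "R")}, "q0", tape, 0)
```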


EXAMPLE 13.4.1: For the TM in figure 13.11, the transition functions are:
a. δ(q0, B) = (q2, B, R).
This means that for q0 as the current state and B as the symbol in the cell currently under the head,
the TM writes B in the cell currently under the head, enters the state q2 and the head
moves to the adjacent cell at right. Similarly, the other transitions of the TM are:

b. δ(q1, 0) = (q2, 0, R)
c. δ(q1, 1) = (q1, B, R)
d. δ(q1, B) = (h, B, N)
e. δ(q2, 0) = (q2, 0, L)
f. δ(q2, 1) = (q1, 1, R)
g. δ(q2, B) = (h, B, N).


13.5 Instantaneous Description of a TM 


The complete state of a TM, at any point during a computation, may be described by:


a. the name of the state in which the machine is,

b. the symbols on the tape and

c. the cell that is currently being scanned.

A description of these three data is called an instantaneous description (ID) or configuration
of a TM. A simple way to represent such a description is shown below:


Figure 13.18. Instantaneous Description: a1 a2 q1 b a3 a4 (the current state q1 is written
immediately before the scanned symbol b)


Formal Definition: An ID of a TM is a string xqy, where q is the current state and xy is the
string made from the tape symbols Γ. The head points to the first character of the substring
y. The initial ID is denoted by qxy, where q is the start state and the head points to the first
symbol from the left, x. The final ID is denoted by xyqB, where q ∈ h is the final state and the
head points to the blank character denoted by B.


EXAMPLE 13.5.1: Consider a TM, whose action is described in the transition table shown
in table 13.7:

States | 0           | 1           | B
q0     | -           | -           | (q2, B, R)
q1     | (q2, 0, R)  | (q1, B, R)  | (h, B, N)
q2     | (q2, 0, L)  | (q1, 1, R)  | (h, B, N)

Table 13.7 Transition table


The action of the TM with the string w = 1010 is shown in figure 13.19:


Figure 13.19. ID of the TM for the String w = 1010 (tape snapshots from Time 0, with
ID: q0B1010B, to Time 6, with ID: B1010Bh)
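
The run summarised in the figure can be reproduced with a short simulation. The Python sketch below encodes Table 13.7 as a dictionary; the function name run, the step budget and the list-based tape are our own assumptions. It returns the final state, the final tape contents and the number of moves made.

```python
BLANK = "B"
OFFSET = {"L": -1, "R": 1, "N": 0}

# Table 13.7: (state, scanned symbol) -> (next state, written symbol, move)
DELTA = {
    ("q0", "B"): ("q2", "B", "R"),
    ("q1", "0"): ("q2", "0", "R"),
    ("q1", "1"): ("q1", "B", "R"),
    ("q1", "B"): ("h", "B", "N"),
    ("q2", "0"): ("q2", "0", "L"),
    ("q2", "1"): ("q1", "1", "R"),
    ("q2", "B"): ("h", "B", "N"),
}

def run(tape_text, state="q0", head=0, max_steps=1000):
    """Run until the halt state h, an undefined move, or the step budget."""
    tape = list(tape_text)
    steps = 0
    while state != "h" and steps < max_steps:
        if head < 0:
            break                     # moved left of the leftmost cell: crash
        key = (state, tape[head])
        if key not in DELTA:
            break                     # no move defined for this pair
        state, tape[head], move = DELTA[key]
        head += OFFSET[move]
        if head == len(tape):
            tape.append(BLANK)        # extend the tape with a blank cell
        steps += 1
    return state, "".join(tape), steps

print(run("B1010B"))   # ('h', 'B1010B', 6)
```

The six moves correspond to Time 1 through Time 6 of the figure; the tape contents are unchanged because every symbol actually scanned on this input is rewritten by itself.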


13.6 Moves of a TM 


As discussed in the previous sections, there are three possible types of moves,
viz.,


■ Move to the left
■ Move to the right and
■ No move.


In this section, we give the formal definition of the moves of a TM.
Formally, let M = (Q, Σ, Γ, δ, q0, B, h) be a TM. Let the ID of M be


(q, a_1 a_2 ... a_{i-1}, a_i, a_{i+1} ... a_n).

Consider the following transitions:


a. δ(q, a_i) = (p, b, L), for moving to the left.

Case i: If i > 1, then the move of the TM in going from the ID (q, a_1 a_2 ... a_{i-1}, a_i,
a_{i+1} ... a_n) to the ID (p, a_1 ... a_{i-2}, a_{i-1}, b a_{i+1} ... a_n) is denoted by:

(q, a_1 a_2 ... a_{i-1}, a_i, a_{i+1} ... a_n) ⊢ (p, a_1 ... a_{i-2}, a_{i-1}, b a_{i+1} ... a_n).

Case ii: If i = 1, the TM crashes, as it is already scanning the leftmost symbol at cell
i and attempts to move to the left, which is not possible. Hence, the move is not
defined.

Case iii: If i = n and B is the blank symbol, then
(q, a_1 a_2 ... a_{n-1}, a_n, e) ⊢ (p, a_1 a_2 ... a_{n-2}, a_{n-1}, B, e).

b. δ(q, a_i) = (p, b, R), for moving to the right.

Case i: If i < n, then the move of the TM is
(q, a_1 ... a_{i-1}, a_i, a_{i+1} ... a_n) ⊢ (p, a_1 ... a_{i-1} b, a_{i+1}, a_{i+2} ... a_n).

Case ii: If i = n, then
(q, a_1 ... a_{n-1}, a_n, e) ⊢ (p, a_1 ... a_{n-1} b, B, e).

c. δ(q, a_i) = (p, b, N), when the head does not move.

Then, the move is denoted as
(q, a_1 ... a_{i-1}, a_i, a_{i+1} ... a_n) ⊢ (p, a_1 ... a_{i-1}, b, a_{i+1} ... a_n).


Note-1: e marks the end of the string.


EXAMPLE 13.6.1: Consider a TM, whose action is described in the transition table shown
in table 13.8.

States | 0           | 1           | B
q0     | -           | -           | (q2, B, R)
q1     | (q2, 0, R)  | (q1, B, R)  | (h, B, N)
q2     | (q2, 1, L)  | (q1, 1, R)  | (q2, B, R)

Table 13.8 Transition table


The action of the TM for the string w = 0101 is as follows:


Figure 13.20. Tape with Symbols 0101


Consider the following transitions:
a. δ(q0, B) = (q2, B, R), this is represented by the move as
q0B0101B ⊢ Bq20101B

b. δ(q2, 0) = (q2, 1, L)
Bq20101B ⊢ q2B1101B

c. δ(q2, B) = (q2, B, R)
q2B1101B ⊢ Bq21101B

d. δ(q2, 1) = (q1, 1, R)
Bq21101B ⊢ B1q1101B

e. δ(q1, 1) = (q1, B, R)
B1q1101B ⊢ B1Bq101B

f. δ(q1, 0) = (q2, 0, R)
B1Bq101B ⊢ B1B0q21B

g. δ(q2, 1) = (q1, 1, R)
B1B0q21B ⊢ B1B01q1B

h. δ(q1, B) = (h, B, N)
B1B01q1B ⊢ B1B01Bh


Thus the move is:


q0B0101B ⊢ Bq20101B ⊢ q2B1101B ⊢ Bq21101B ⊢ B1q1101B ⊢ B1Bq101B ⊢
B1B0q21B ⊢ B1B01q1B ⊢ B1B01Bh


The equivalent notation is:


q0B0101B ⊢* B1B01Bh.
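
The sequence of IDs can also be generated mechanically. The Python sketch below encodes Table 13.8 and records one ID string per configuration; the name trace and the list-based tape are our own choices. The sketch always writes the state immediately before the scanned cell, so its last ID reads B1B01hB where the text writes B1B01Bh.

```python
OFFSET = {"L": -1, "R": 1, "N": 0}

# Table 13.8: (state, scanned symbol) -> (next state, written symbol, move)
TABLE_13_8 = {
    ("q0", "B"): ("q2", "B", "R"),
    ("q1", "0"): ("q2", "0", "R"),
    ("q1", "1"): ("q1", "B", "R"),
    ("q1", "B"): ("h", "B", "N"),
    ("q2", "0"): ("q2", "1", "L"),
    ("q2", "1"): ("q1", "1", "R"),
    ("q2", "B"): ("q2", "B", "R"),
}

def trace(delta, tape_text, state="q0", head=0, max_steps=100):
    """Return the list of IDs (state written before the scanned cell)."""
    tape = list(tape_text)
    ids = ["".join(tape[:head]) + state + "".join(tape[head:])]
    for _ in range(max_steps):
        key = (state, tape[head])
        if key not in delta:
            break                     # halt state or undefined move
        state, tape[head], move = delta[key]
        head += OFFSET[move]
        if head < 0:
            raise RuntimeError("crash: move left from the leftmost cell")
        if head == len(tape):
            tape.append("B")
        ids.append("".join(tape[:head]) + state + "".join(tape[head:]))
    return ids

ids = trace(TABLE_13_8, "B0101B")
print(" |- ".join(ids))
```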
13.7 String Classes in TM


Every Turing machine TM, over the alphabet Σ, divides the set of input strings w into three
classes:


a. Accept (TM) is the set of all strings w ∈ Σ*, such that, if the tape initially contains
w and the TM is then run, then the TM ends in a HALT state.

b. Loop (TM) is the set of all strings w ∈ Σ*, such that, if the tape initially contains w
and the TM is then run, then the TM loops forever (infinite loop).

c. Reject (TM) is the set of all strings w ∈ Σ* such that any of the following three
cases arises:


Case (i): There may be a state and a symbol under the tape head, for which δ does not
have a value.

Case (ii): If the head is reading the leftmost cell (cell i), containing the symbol x, and the
state of the TM is, say, q, then δ(q, x) suggests a move to the left of the current
cell. However, as there is no cell to the left of the leftmost cell, no move
is possible.

Case (iii): The TM enters an infinite loop. If a TM rejects a given string w because of
the above two cases, we say that the TM crashes (terminates unsuccessfully).
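
A simulator can recognise the Accept class and the two crash cases directly, but it cannot in general confirm that a machine loops forever; it can only give up after a chosen number of moves. The Python sketch below makes that limitation explicit by returning "undetermined" when its step budget runs out. The function name classify, the budget and the three-transition demonstration machine DEMO are hypothetical.

```python
OFFSET = {"L": -1, "R": 1, "N": 0}

def classify(delta, halt_states, word, start="q0", blank="B", budget=1000):
    """Return 'accept', 'reject' or 'undetermined' for the given input word."""
    tape = dict(enumerate(word))      # cell number -> symbol; cell 0 is leftmost
    state, head = start, 0
    for _ in range(budget):
        if state in halt_states:
            return "accept"
        if head < 0:
            return "reject"           # Case (ii): moved left of the leftmost cell
        key = (state, tape.get(head, blank))
        if key not in delta:
            return "reject"           # Case (i): delta has no value here
        state, tape[head], move = delta[key]
        head += OFFSET[move]
    return "accept" if state in halt_states else "undetermined"

# Hypothetical machine: halts on b, spins in place on a, walks off the left end on c.
DEMO = {
    ("q0", "a"): ("q0", "a", "N"),
    ("q0", "b"): ("h", "b", "N"),
    ("q0", "c"): ("q0", "c", "L"),
}
```

With DEMO, the input b is accepted, the inputs c and the empty word are rejected, and the input a exhausts the budget.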


13.8 Language Accepted by a TM 


The language accepted by a TM is the set of accepted strings w ∈ Σ*.


Formally, let M = (Q, Σ, Γ, δ, q0, B, h) be a TM. The language accepted by M, denoted
by L(M), is defined as

L(M) = {w | w ∈ Σ* and if w = a_1 ... a_n, then
        (q0, e, a_1, a_2 ... a_n) ⊢* (h, b_1 ... b_{j-1}, b_j, b_{j+1} ... b_n)
        for some b_1 b_2 ... b_n ∈ Γ*}
or

L(M) = {w : q0 w ⊢* x_1 h x_2}.

a. Turing Acceptable Language
A language L over some alphabet is said to be a Turing acceptable language, if there
exists a TM M such that L = L(M).

b. Turing Decidable Language
A language L over Σ, i.e., L ⊆ Σ*, is said to be Turing decidable, if both the languages
L and its complement Σ* − L are Turing acceptable.

c. Recursively enumerable language
A language L is recursively enumerable, if it is accepted by a TM.


We discuss in detail, the TM languages, in chapter 14-section 14.7. 


13.9 Role of TM’s 


The TMs are designed to play at least the following three roles:


a. Accepting devices for languages (similar to the role played by FAs and PDAs). 

b. Computer of functions 
In this role, a TM represents a particular function (say the SUM function which gives 
as output, the sum of two positive integers given as input). Here the initial input 
represents an argument of the function and the (final) string on the tape (when the 
TM enters the Halt State) is treated as the value obtained by the application of the 
function to the argument represented by the initial string. 

c. An enumerator of strings of a language, that outputs the strings of a language (one 
at a time) in some systematic order i.e. as a list. 


13.10 Design of TM’s 


The basic strategy for designing a TM is given below: 


a. The objective of scanning a symbol by the tape head is to know about the future 
status. The machine must remember the symbols scanned previously, by going to the 
next unique state. 

b. The number of states must be minimised. This can be achieved by changing the 
states: 

■ only when there is a change in the written symbol or
■ when there is a change in the movement of the tape head.


13.10.1 TM as Accepting Devices for Languages

This concept is illustrated through the following examples.

EXAMPLE 13.10.1: Design a TM that erases all non-blank symbols on the tape, over the
alphabet {a, b}.


Design Strategy: The TM in state q0 must perform the following operations:


■ On input symbol a, replace a by B, move the tape head towards right and stay at q0.
■ On input symbol b, replace b by B, move the tape head towards right and stay at q0.
■ On input symbol B, replace B by B, change the state to h and do not move the tape head.


Thus the TM M = (Q, Σ, Γ, δ, q0, B, h) is
M = ({q0, h}, {a, b}, {a, b, B}, δ, q0, B, h)


where δ is given by:


Transition diagram:


Figure 13.21. Transition Diagram for M to Erase All Non-Blank Symbols (loop at q0 labelled
a/B, R and b/B, R)


Transition table:

       |              Tape Symbol
States | a           | b           | B
q0     | <q0, B, R>  | <q0, B, R>  | <h, B, N>
h      | -           | -           | Accept

Table 13.9 Transition table for M


TM action for the string w = abab


Figure 13.22. The Tape


ID is: q0ababB ⊢ Bq0babB
               ⊢ BBq0abB
               ⊢ BBBq0bB
               ⊢ BBBBq0B
               ⊢ BBBBBh


Since the final state h is reached, the string abab is accepted.
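
The erasing machine is small enough to simulate in a few lines. The Python sketch below encodes the three transitions of Table 13.9; the function name erase and the list-based tape are our own choices, and the input is assumed to contain only the letters a and b.

```python
# Table 13.9: (state, scanned symbol) -> (next state, written symbol, move)
DELTA = {
    ("q0", "a"): ("q0", "B", "R"),
    ("q0", "b"): ("q0", "B", "R"),
    ("q0", "B"): ("h", "B", "N"),
}

def erase(word):
    """Run M on word followed by one blank; return (final state, final tape)."""
    tape = list(word) + ["B"]
    state, head = "q0", 0
    while state != "h":
        state, tape[head], move = DELTA[(state, tape[head])]
        if move == "R":
            head += 1                 # the trailing blank stops the scan
    return state, "".join(tape)

print(erase("abab"))   # ('h', 'BBBBB')
```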


EXAMPLE 13.10.2: Design a TM that accepts the language of all strings, over the alphabet
Σ = {a, b}, whose second letter is b.


Design Strategy:


Step-1: In state q0
■ On input symbol a, change to state q1, replace a by a and move the tape head
towards right.
■ On input symbol b, change to state q1, replace b by b and move the tape head
towards right.
Step-2: In state q1 on input symbol b, change to state q2, replace b by b and move the
tape head towards right.
Step-3: In state q2
■ On input symbol a, stay at q2, replace a by a and move the tape head towards
right.
■ On input symbol b, stay at q2, replace b by b and move the tape head towards
right.
■ On input symbol B, change to state h and do not move the tape head.


Thus, the TM, M = (Q, Σ, Γ, δ, q0, B, h) is
M = ({q0, q1, q2, h}, {a, b}, {a, b, B}, δ, q0, B, h),


where δ is given by:
Transition diagram:


Figure 13.23. Transition Diagram for TM


Transition table:

       |              Tape Symbol
States | a           | b           | B
q0     | (q1, a, R)  | (q1, b, R)  | -
q1     | -           | (q2, b, R)  | -
q2     | (q2, a, R)  | (q2, b, R)  | (h, B, N)


a. TM action for the string w = aba


Figure 13.24. The Tape


ID is: q0abaB ⊢ aq1baB
              ⊢ abq2aB
              ⊢ abaq2B
              ⊢ abaBh


Since the final state h is reached, the string aba is accepted.
b. TM action for the string w = aaa


Figure 13.25. The Tape


ID is: q0aaaB ⊢ aq1aaB

The transition is not defined for (q1, a). The machine halts and the string w = aaa is
rejected. In other words, we say that the TM crashes when the string w is given as input.
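
Both runs can be reproduced with the following Python sketch, which encodes the transitions listed in the design strategy (Steps 1 to 3). The function name accepts, the dictionary-based tape and the step budget are assumptions of the sketch; an undefined move is treated as rejection, as in the run on aaa.

```python
OFFSET = {"L": -1, "R": 1, "N": 0}

# Transitions from Steps 1 to 3 of the design strategy.
DELTA = {
    ("q0", "a"): ("q1", "a", "R"),
    ("q0", "b"): ("q1", "b", "R"),
    ("q1", "b"): ("q2", "b", "R"),
    ("q2", "a"): ("q2", "a", "R"),
    ("q2", "b"): ("q2", "b", "R"),
    ("q2", "B"): ("h", "B", "N"),
}

def accepts(word, max_steps=10_000):
    """Return True if the machine reaches h, False if it crashes."""
    tape = dict(enumerate(word))      # missing cells are blank
    state, head = "q0", 0
    for _ in range(max_steps):
        if state == "h":
            return True
        key = (state, tape.get(head, "B"))
        if key not in DELTA:
            return False              # no move defined: the TM crashes
        state, tape[head], move = DELTA[key]
        head += OFFSET[move]
    return state == "h"

print(accepts("aba"), accepts("aaa"))   # True False
```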


EXAMPLE 13.10.3: Design a TM which accepts all strings of the form a^n b^n for n ≥ 1.
Design strategy:
Let q0 be the start state and let the tape head point to the first symbol of the string to be scanned.


Step-1: In state q0,

■ on input symbol a, change to state q1, replace a by x and move the tape head
towards right.

■ on input symbol y, change the state to q3, replace y by y and move the tape head
towards right.


Step-2: In state q1, search for the leftmost b. While the head moves towards this b, the
symbol encountered may be a or y. Irrespective of which symbol is encountered,
replace a by a, y by y, remain in state q1 and move the head towards right.

Transitions are : δ(q1, a) = (q1, a, R)
                  δ(q1, y) = (q1, y, R)

Once the leftmost b is reached, replace it by y, change the state to q2 and move
the head towards left.

Transition : δ(q1, b) = (q2, y, L)


Step-3: In state q2, search for the rightmost x to get the leftmost a. During this process, the
symbols encountered may be y's and a's. Replace y by y, a by a, remain in state
q2 and move the head towards left.

Transitions are : δ(q2, y) = (q2, y, L)
                  δ(q2, a) = (q2, a, L)

Once the rightmost x is reached, replace x by x, change the
state to q0 and move the head towards right, to get the leftmost a.

Transition : δ(q2, x) = (q0, x, R)


Step-4: In state q3, search for y or blank B. On encountering y, replace it by y, remain in q3
and move the head towards right. On encountering B, change to state q4, replace
B by B and move towards left.


Thus, the TM, M = (Q, Σ, Γ, δ, q0, B, h) is
M = ({q0, q1, q2, q3, q4}, {a, b}, {a, b, x, y}, δ, q0, B, {q4}),


where δ is given by:
Transition diagram:


Figure 13.26. Transition Diagram for TM

Transition table:

       |                          Tape Symbol
States | a           | b           | x           | y           | B
q0     | (q1, x, R)  | -           | -           | (q3, y, R)  | -
q1     | (q1, a, R)  | (q2, y, L)  | -           | (q1, y, R)  | -
q2     | (q2, a, L)  | -           | (q0, x, R)  | (q2, y, L)  | -
q3     | -           | -           | -           | (q3, y, R)  | (q4, B, L)

Table 13.11 Transition table for TM


TM action for the string w = aabb


Figure 13.27. The Tape


ID is: q0aabbB ⊢ xq1abbB
               ⊢ xaq1bbB
               ⊢ xq2aybB
               ⊢ q2xaybB
               ⊢ xq0aybB
               ⊢ xxq1ybB
               ⊢ xxyq1bB
               ⊢ xxq2yyB
               ⊢ xq2xyyB
               ⊢ xxq0yyB
               ⊢ xxyq3yB
               ⊢ xxyyq3B
               ⊢ xxyyq4


Since the final state q4 is reached, the string aabb is accepted. Figure 13.28 (a-n) below
shows the action of the TM for the string aabb.


Figure 13.28. a-n: The TM Action for w = aabb
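
The design can be tested against every short string over {a, b}. The Python sketch below encodes the transitions of this example, including δ(q1, b) = (q2, y, L), which is read off the ID sequence for aabb; the names accepts and DELTA, the dictionary-based tape and the step budget are assumptions of the sketch.

```python
from itertools import product

OFFSET = {"L": -1, "R": 1, "N": 0}

# (state, scanned symbol) -> (next state, written symbol, move)
DELTA = {
    ("q0", "a"): ("q1", "x", "R"), ("q0", "y"): ("q3", "y", "R"),
    ("q1", "a"): ("q1", "a", "R"), ("q1", "y"): ("q1", "y", "R"),
    ("q1", "b"): ("q2", "y", "L"),
    ("q2", "a"): ("q2", "a", "L"), ("q2", "y"): ("q2", "y", "L"),
    ("q2", "x"): ("q0", "x", "R"),
    ("q3", "y"): ("q3", "y", "R"), ("q3", "B"): ("q4", "B", "L"),
}

def accepts(word, final="q4", max_steps=10_000):
    """Return True if the machine reaches the final state on word."""
    tape = dict(enumerate(word))      # missing cells are blank
    state, head = "q0", 0
    for _ in range(max_steps):
        if state == final:
            return True
        key = (state, tape.get(head, "B"))
        if head < 0 or key not in DELTA:
            return False              # crash: the string is rejected
        state, tape[head], move = DELTA[key]
        head += OFFSET[move]
    return state == final

# Compare with the definition of the language on all strings of length <= 6.
for length in range(7):
    for letters in product("ab", repeat=length):
        w = "".join(letters)
        n = length // 2
        expected = length >= 2 and w == "a" * n + "b" * n
        assert accepts(w) == expected
```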


EXAMPLE 13.10.4: Design a TM to accept the language L(M) = {a"b"c"|n > 1}. 


Design strategy: Let qo be the start state and the tape head point to the first symbol of the 
string to be scanned. 


Step-1:. m In state go on input symbol a, change to state q;, replace a by x and move the 
tape head towards right. 
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w In state gq; on input symbol 5, change to state g2, replace b by y and move the 
tape head towards right. 
m In state gz on input symbol c, change to q3, replace c by z and move the tape 
head towards left. 
Step-2: In state q;, search for the leftmost b. In the process of searching the symbols a or 
y may be encountered. So replace a by a, y by y and move the head towards right 
and stay in state q1 


Transitions are : 5(q1,a) = (qi, 4, R) 
8(q1,¥) = (q1,y,R). 

Step-3: In state q2, search for the leftmost b. In this process of searching, the symbols b 
or z may be encountered. So, replace b by b, z by z, move the tape head towards 
right and remain in state q2. 

Transitions are : 5(q2,b) = (q2, b, R) 
5(q2,2) = (q2,2,R). 

Step-4: TM in state g3 means that equal number of a’s, b’s and c’s are replaced by equal 
number of x’s, y’s and z’s. In q3, search for the rightmost x to get the leftmost a. 


During this process, the symbols z, b, y,a and x may be encountered. So replace 
them by the same respective symbols, move the tape head towards left and stay in 


q3- 


Transitions are : 5(q3,2) = (q3,2,L), 8(q3,b) = (q3,b,L) 
5(q3,y) = (q3,y,L), 5(q3,4) = (q3,a,L) 


Now once x is encountered, replace x by x, change to state go and move the head 
towards right to get the leftmost a. 


Transition : 5(q3,x) = (qo, x, R) 
Step-5: In state q0 on input symbol y, change to state q4, replace y by y and move the tape
head towards right.
Step-6: In state q4 on input symbol y, stay in q4, replace y by y and move the head towards
right.

Transition : δ(q4,y) = (q4,y,R).
If z is encountered in q4, change to q5, replace z by z and move the head towards
right.

Transition : δ(q4,z) = (q5,z,R).

(TM in q4 with input z means that no a's and no b's remain.)

Step-7: In state q5 on input symbol z, stay in q5, replace z by z and move the head towards
right. Continue to be in q5 for the input symbol z, so that there are only z's and no
more c's. However, if B is encountered, change to state q6, replace B by B
and move the head towards right.

Transitions : δ(q5,z) = (q5,z,R)
δ(q5,B) = (q6,B,R)


Thus M = (Q, Σ, Γ, δ, q0, B, A) i.e.,
M = ({q0, q1, q2, q3, q4, q5, q6}, {a, b, c}, {a, b, c, x, y, z}, δ, q0, B, {q6}),


where 6 is given by: 


Transition Diagram: 


[Diagram: states q0 to q6 with the transitions listed in Table 13.12]


Figure 13.29. Transition Diagram for M 
Transition table:

States   a          b          c          x          y          z          B
q0       (q1,x,R)   -          -          -          (q4,y,R)   -          -
q1       (q1,a,R)   (q2,y,R)   -          -          (q1,y,R)   -          -
q2       -          (q2,b,R)   (q3,z,L)   -          -          (q2,z,R)   -
q3       (q3,a,L)   (q3,b,L)   -          (q0,x,R)   (q3,y,L)   (q3,z,L)   -
q4       -          -          -          -          (q4,y,R)   (q5,z,R)   -
q5       -          -          -          -          -          (q5,z,R)   (q6,B,R)


Table 13.12 Transition table for M. 


TM action for the string w = abc 


Figure 13.30. The Tape 

ID is: q0abcB ⊢ xq1bcB
⊢ xyq2cB
⊢ xq3yzB
⊢ q3xyzB
⊢ xq0yzB
⊢ xyq4zB
⊢ xyzq5B
⊢ xyzBq6.

Since the final state q6 is reached, the string w = abc is accepted.
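
The design above can be checked mechanically. The following is a minimal Python sketch, not part of the original design: the names `run_tm`, `DELTA` and `max_steps` are illustrative assumptions, and the dictionary simply transcribes Table 13.12. A missing entry is treated as a halt without acceptance.

```python
BLANK = "B"

# DELTA[(state, symbol)] = (next state, symbol to write, head move), from Table 13.12.
DELTA = {
    ("q0", "a"): ("q1", "x", "R"), ("q0", "y"): ("q4", "y", "R"),
    ("q1", "a"): ("q1", "a", "R"), ("q1", "b"): ("q2", "y", "R"),
    ("q1", "y"): ("q1", "y", "R"),
    ("q2", "b"): ("q2", "b", "R"), ("q2", "c"): ("q3", "z", "L"),
    ("q2", "z"): ("q2", "z", "R"),
    ("q3", "a"): ("q3", "a", "L"), ("q3", "b"): ("q3", "b", "L"),
    ("q3", "y"): ("q3", "y", "L"), ("q3", "z"): ("q3", "z", "L"),
    ("q3", "x"): ("q0", "x", "R"),
    ("q4", "y"): ("q4", "y", "R"), ("q4", "z"): ("q5", "z", "R"),
    ("q5", "z"): ("q5", "z", "R"), ("q5", BLANK): ("q6", BLANK, "R"),
}


def run_tm(delta, word, start="q0", final="q6", max_steps=10_000):
    """Return True if the machine reaches `final`, False if it gets stuck."""
    tape = dict(enumerate(word))      # sparse tape; unseen cells are blank
    state, head = start, 0
    for _ in range(max_steps):
        if state == final:
            return True
        key = (state, tape.get(head, BLANK))
        if key not in delta:          # no instruction defined: halt, reject
            return False
        state, tape[head], move = delta[key]
        head += {"R": 1, "L": -1, "N": 0}[move]
    raise RuntimeError("step limit reached")


print(run_tm(DELTA, "abc"))      # the string traced above
print(run_tm(DELTA, "aabbc"))    # one c short
```

The step limit is only a safeguard for experimenting with other tables; this particular machine always halts.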


EXAMPLE 13.10.5: Design a TM that recognises the language L of all strings, over {a, b}, 
with number of a’s equal to the number of b’s. 


Design Strategy: Let q0 be the initial state and the tape head point to the first symbol of
the string to be scanned, which can either be a or b. The following cases are considered,
based on the next input symbol to be scanned.
Case-1: Next input symbol to be scanned is B.


Change the state from q0 to h, replace B by B and move the head towards right.


Transition : δ(q0,B) = (h,B,R).
Case-2: Next input symbol to be scanned is a. 


In state q0 on input symbol a, replace a by x and skip all subsequent symbols till the symbol
b is encountered; replace this b by y. Then, come back to the next leftmost symbol and repeat
any of the three cases based on the next symbol to be scanned.


Transitions are : δ(q0,y) = (q0,y,R)
δ(q0,a) = (q1,x,R)
δ(q1,a) = (q1,a,R)
δ(q1,y) = (q1,y,R)
δ(q1,b) = (q2,y,L)
δ(q2,y) = (q2,y,L)
δ(q2,a) = (q2,a,L)
δ(q2,x) = (q0,x,R).


Case-3: Next input symbol to be scanned is b. 


In state q0 on input symbol b, replace b by x and skip all subsequent symbols till the symbol
a is encountered; replace this a by y. Then, come back to the next leftmost symbol and repeat
any of the three cases based on the next symbol to be scanned.


Transitions are : δ(q0,y) = (q0,y,R)
δ(q0,b) = (q3,x,R)
δ(q3,b) = (q3,b,R)
δ(q3,y) = (q3,y,R)
δ(q3,a) = (q4,y,L)
δ(q4,y) = (q4,y,L)
δ(q4,b) = (q4,b,L)
and δ(q4,x) = (q0,x,R).


Thus, the TM M = (Q, Σ, Γ, δ, q0, B, h) is


M = ({q0, q1, q2, q3, q4, h}, {a, b}, {a, b, x, y}, δ, q0, B, h)


where 6 is given by: 


Transition diagram: 


Figure 13.31. Transition Diagram for M 
Transition table:

                    Tape Symbol
States   a          b          x          y          B
q0       (q1,x,R)   (q3,x,R)   -          (q0,y,R)   (h,B,R)
q1       (q1,a,R)   (q2,y,L)   -          (q1,y,R)   -
q2       (q2,a,L)   -          (q0,x,R)   (q2,y,L)   -
q3       (q4,y,L)   (q3,b,R)   -          (q3,y,R)   -
q4       -          (q4,b,L)   (q0,x,R)   (q4,y,L)   -


Table 13.13 Transition table. 


TM action for the string w = ababab 


Figure 13.32. The Tape 


ID is: q0abababB ⊢ xq1bababB
⊢ q2xyababB
⊢ xq0yababB
⊢ xyq0ababB
⊢ xyxq1babB
⊢ xyq2xyabB
⊢ xyxq0yabB
⊢ xyxyq0abB
⊢ xyxyxq1bB
⊢ xyxyq2xyB
⊢ xyxyxq0yB
⊢ xyxyxyq0B
⊢ xyxyxyBh.


EXAMPLE 13.10.6: Design a TM ‘parity counter’ that outputs 0 or 1, depending on whether 
the number of 1’s in the input sequence is even or odd respectively. 
Design Strategy: Let q0 be the start state and the tape head point to the first symbol of the
string to be scanned. The TM switches between the states q0 and q1 each time a 1 is scanned,
replacing every scanned symbol by 0. On reaching the blank B, the TM writes 0 (or 1) and
halts if the number of 1's in the input sequence is even (or odd).


Thus, the TM, M = (Q, Σ, Γ, δ, q0, B, h) is
M = ({q0, q1, h}, {0, 1}, {0, 1}, δ, q0, B, h),
where 6 is given by: 


Transition diagram:
[Diagram: loops 0/0, R on q0 and q1; edges 1/0, R between q0 and q1; B/0, N from q0 to h; B/1, N from q1 to h]


Figure 13.33. Transition Diagram for TM 


Transition table:

States   0          1          B
q0       (q0,0,R)   (q1,0,R)   (h,0,N)
q1       (q1,0,R)   (q0,0,R)   (h,1,N)


Table 13.14 Transition table 


TM action for the input w = 10110101 


Figure 13.34. The Tape 


ID is: q0 10110101 ⊢ 0 q1 0110101
⊢ 00 q1 110101




Downloaded from https://www.cambridge.org/core. Stockholm University Library, on 06 Dec 2018 at 08:05:31, subject to the Cambridge Core terms of use, available at 
https://www.cambridge.org/core/terms. https://doi.org/10.1017/UPO9788175968363.014 




⊢ 000 q0 10101
⊢ 0000 q1 0101
⊢ 00000 q1 101
⊢ 000000 q0 01
⊢ 0000000 q0 1
⊢ 00000000 q1 B
⊢ 00000000 h 1


Since the final state h is reached, the TM halts with the output 1 (odd parity). 
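
As a cross-check of Table 13.14, here is a small Python sketch of the parity counter. The function name `parity` and the list-based tape are assumptions made for illustration; the input is assumed to contain only the symbols 0 and 1.

```python
BLANK = "B"

# Table 13.14: every scanned symbol is overwritten with 0 and the state
# records whether an even (q0) or odd (q1) number of 1's has been seen.
DELTA = {
    ("q0", "0"): ("q0", "0", "R"), ("q0", "1"): ("q1", "0", "R"),
    ("q0", BLANK): ("h", "0", "N"),
    ("q1", "0"): ("q1", "0", "R"), ("q1", "1"): ("q0", "0", "R"),
    ("q1", BLANK): ("h", "1", "N"),
}


def parity(bits):
    """Run the machine on `bits` and return the symbol written on halting."""
    tape = list(bits) + [BLANK]
    state, head = "q0", 0
    while state != "h":
        state, tape[head], move = DELTA[(state, tape[head])]
        if move == "R":
            head += 1
    return tape[head]


print(parity("10110101"))   # the input traced above
```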
EXAMPLE 13.10.7: Design a TM 'parentheses checker' that outputs 1 or 0, depending on
whether the sequence is properly formed or not.


Design Strategy: Let q0 be the start state and the tape head point to the first symbol of the
string to be scanned (the machine always starts with the leftmost bracket symbol in state
q0). The input to the TM is a sequence of left and right brackets. The output is 1 or 0,
depending on whether the sequence is properly formed or not.


Thus, the TM is M = (Q, Σ, Γ, δ, q0, B, h) i.e.,


M = ({q0, q1, q2, h}, {(, )}, {(, ), x, 0, 1}, δ, q0, B, h),


where 6 is given by: 


Transition diagram:
[Diagram: states q0 (start), q1, q2 and h; edge labels include x/x, R; x/x, L; )/x, L; (/x, R; B/B, L and B/0, N]
Figure 13.35. Transition Diagram for TM 


Transition table: 


< q0,(,R > 


<qo.x,R> 
<h,0,N > 


Table 13.15 Transition table for TM 


TM action for input string w = (()) 


Figure 13.36. The Tape 


ID is: q0(()) ⊢ (q0())
⊢ ((q0))
⊢ (q1(x)
⊢ (xq0x)
⊢ (xxq0)
⊢ (xq1xx
⊢ (q1xxx
⊢ q1(xxx
⊢ xq0xxx
⊢ xxq0xx
⊢ xxxq0x
⊢ xxxxq0B
⊢ xxxq2xB
⊢ xxq2xxB
⊢ xq2xxxB
⊢ q2xxxxB
⊢ q2BxxxxB
⊢ h1xxxxB


Since the final state h is reached, the TM halts with the output 1 (balanced parentheses). 


EXAMPLE 13.10.8: Design a TM that copies a given string over {a, b}. Find the computation 
of TM for the string abb. 


Design Strategy: Let q0 be the start state and the tape head point to the blank symbol on
the tape.


The TM in state q1 scans the leftmost a or b, replaces it by x or y and copies it to the next
available B (the first B on the right is left as a marker and is not available). If the symbol
in state q1 is a, the TM (while skipping symbols) passes through the state q2 and reaches
q4. However, if the symbol in state q1 is b, then the TM (while skipping the symbols) passes
through the state q3 and reaches the state q5. Then the TM copies the symbol and reaches
the state q6. Next, the TM starts its leftward scan, skipping over a's, b's and B, and
meets x or y in q7. At this stage, the TM goes to the state q1 and repeats the whole process
until the whole string is copied in the second part of the tape.


Finally, the TM goes from q1 to state q8 to replace each x by a and each y by b.


Initial tape with w = abb Final tape after TM action 


Figure 13.37. TM to Copy w = abb 
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Transition diagram: 


Figure 13.38. Transition Diagram for TM 


Transition table:

                    Tape Symbol
States   a          b          x          y          B
q0       -          -          -          -          (q1,B,R)
q1       (q2,x,R)   (q3,y,R)   -          -          (q8,B,L)
q2       (q2,a,R)   (q2,b,R)   -          -          (q4,B,R)
q3       (q3,a,R)   (q3,b,R)   -          -          (q5,B,R)
q4       (q4,a,R)   (q4,b,R)   -          -          (q6,a,L)
q5       (q5,a,R)   (q5,b,R)   -          -          (q6,b,L)
q6       (q6,a,L)   (q6,b,L)   -          -          (q7,B,L)
q7       (q7,a,L)   (q7,b,L)   (q1,x,R)   (q1,y,R)   -
q8       -          -          (q8,a,L)   (q8,b,L)   (h,B,N)


Table 13.16 Transition table for the TM 
Thus the TM is M = (Q, Σ, Γ, δ, q0, B, h)
where Q = {q0, q1, q2, q3, q4, q5, q6, q7, q8, h}
Σ = {a, b}
Γ = {a, b, x, y}
q0 is the start state
B is the blank symbol
h is the halt state

and δ is shown in figure 13.38 and table 13.16.
TM action for the string w = abb 

ID is: q0Babb ⊢ Bq1abbB
⊢ Bxq2bbB
⊢ Bxbq2bB
⊢ Bxbbq2B
⊢ BxbbBq4B
⊢ Bxbbq6Ba
⊢ Bxbq7bBa
⊢ Bxq7bbBa
⊢ Bq7xbbBa
⊢ Bxq1bbBa
⊢ Bxyq3bBa
⊢ Bxybq3Ba
⊢ BxybBq5a
⊢ BxybBaq5B
⊢ BxybBq6ab
⊢ Bxybq6Bab
⊢ Bxyq7bBab
⊢ Bxq7ybBab
⊢ Bxyq1bBab
⊢ Bxyyq3Bab
⊢ BxyyBq5ab
⊢ BxyyBaq5b
⊢ BxyyBabq5B
⊢ BxyyBaq6bb
⊢ BxyyBq6abb
⊢ Bxyyq6Babb
⊢ Bxyq7yBabb
⊢ Bxyyq1Babb
⊢ Bxyq8yBabb
⊢ Bxq8ybBabb
⊢ Bq8xbbBabb
⊢ q8BabbBabb
⊢ hBabbBabb


Since the final state h is reached, the TM halts with the string abb copied onto the second
part of the tape.


EXAMPLE 13.10.9: Design a TM that accepts all the palindromes over the alphabet {a, b}. 


Design strategy: Let go be the start state and the tape head point to the first symbol of the 
string to be scanned. 


Step-1: In state q0, with the input as blank symbol B, the machine has found a palindrome
of even length.

Step-2: In state q0, with the input as non-blank symbol 'a', the machine replaces it by B
and enters the state q1, in which all the a's and b's are skipped. If B is encountered,
then it goes from q1 to q3 to find a (matching a in last non-blank symbol position).
If there is a match, then it goes to q5 and replaces 'a' by 'B'.

Step-3: In state q3, if only B is encountered, it means the previous 'a' was the middle
symbol of the given string. Then, the machine has found a palindrome of odd
length.

Step-4: For an input of non-blank symbol 'b' in state q0, steps similar to 2 and 3 follow
except that the next state is q2 (with a's and b's interchanged).

Step-5: When the machine in state q3 finds b, or finds a when in state q4, then the string
under consideration is not a palindrome.


Thus the TM is, M = (Q, Σ, Γ, δ, q0, B, h) i.e.,


M = ({q0, q1, q2, q3, q4, q5, h}, {a, b}, {a, b}, δ, q0, B, h),
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where 6 is given by: 


Transition diagram: 


[Diagram: states q0 to q5 and h; surviving edge labels include a/a, R; b/b, R and B/B, R]


Figure 13.39. Transition Diagram for M 


Transition table:

              Tape Symbol
States   a          b          B
q0       (q1,B,R)   (q2,B,R)   (h,B,R)
q1       (q1,a,R)   (q1,b,R)   (q3,B,L)
q2       (q2,a,R)   (q2,b,R)   (q4,B,L)
q3       (q5,B,L)   -          (h,B,R)
q4       -          (q5,B,L)   (h,B,R)
q5       (q5,a,L)   (q5,b,L)   (q0,B,R)


Table 13.17 Transition table for M 
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TM action for the string w = bab 


Figure 13.40. The Tape 


ID is: q0bab ⊢ Bq2ab
⊢ Baq2b
⊢ Babq2B
⊢ Baq4bB
⊢ Bq5aBB
⊢ q5BaBB
⊢ Bq0aBB
⊢ BBq1BB
⊢ Bq3BBB
⊢ BBhBBB.


Since the final state h is reached, the string is accepted. 


EXAMPLE 13.10.10: Design a TM that recognises the language consisting of all strings of
0's whose length is a power of 2, i.e. L = {0^(2^n) | n ≥ 0}.


Design Strategy: Let q0 be the start state and the tape head point to the first symbol of the
string to be scanned.


Step-1: In state q0 on input symbol 0, replace 0 by B and move the tape head towards right.

Step-2: Scan from left to right across the tape, replacing every alternate 0 by x.

Step-3: If the string contains a single 0, then the string is accepted by the TM.

Step-4: If the string contains more than a single 0 and the number of 0's is odd, then the
TM goes to the state q5 and the string is rejected.

Thus, the TM is,


M = (Q, Σ, Γ, δ, q0, B, h) i.e.,
M = ({q0, q1, q2, q3, q4, q5, h}, {0}, {0, x}, δ, q0, B, h),
where δ is given by:
Transition diagram:


Figure 13.41. Transition Diagram for M


Transition Table:

States   0          x          B
q0       (q1,B,R)   (q5,x,R)   (q5,B,R)
q1       (q2,x,R)   (q1,x,R)   (h,B,R)
q2       (q3,0,R)   (q2,x,R)   (q4,B,L)
q3       (q2,x,R)   (q3,x,R)   (q5,B,R)
q4       (q4,0,L)   (q4,x,L)   (q1,B,R)


Table 13.18 Transition table for M. 


TM action for the string w = 0000. 


Figure 13.42. The Tape 


ID is: q0 0000 ⊢ B q1 000
⊢ Bx q2 00
⊢ Bx0 q3 0
⊢ Bx0x q2 B
⊢ Bx0 q4 x
⊢ Bx q4 0x
⊢ B q4 x0x
⊢ q4 Bx0x
⊢ B q1 x0x
⊢ Bx q1 0x
⊢ Bxx q2 x
⊢ Bxxx q2 B
⊢ Bxx q4 xB
⊢ Bx q4 xxB
⊢ B q4 xxxB
⊢ q4 BxxxB
⊢ B q1 xxxB
⊢ Bx q1 xxB
⊢ Bxx q1 xB
⊢ Bxxx q1 B
⊢ BxxxB h.


Since the final state h is reached, the string 0000 is accepted.


Theorem 1 Every regular language has a TM that accepts it.


Proof. We know that there is a DFA for every regular language. Draw a DFA for the
language. Change the edge labels from a and b to a/a, R and b/b, R. Add a halt state.
Take away the accept status of the accept states and add an edge from each one, labelled
B/B, N, to the halt state.
Construction of a TM from the given DFA


The following are the steps required:


1. Change the edge labels of a given DFA from a and b to a/a, R and b/b, R.
2. Add a halt state.
3. Take all the 'accept' states of DFA and add an edge from each one, labelled
B/B, N, to the halt state.


EXAMPLE 13.10.11: Construct a TM to accept the language containing strings of 0’s and 
1’s ending with 00. 


Solution: The transition diagram of DFA, which accepts the language consisting of strings 
of 0’s and 1’s ending with the string 00, is shown in figure 13.43. 


[Diagram: DFA with states q0 (start), q1 and q2 (accepting)]


Figure 13.43. DFA to Accept L = (0 + 1)*00 


Change the edge labels of the above DFA from 0 and 1 to 0/0, R and 1/1, R. Add the halt state
h and add an edge labelled B/B, N from q2 to h.


[Diagram: the DFA above with edge labels 0/0, R and 1/1, R and an edge B/B, N from q2 to h]


Figure 13.44. Transition Diagram for TM to accept L = (0 + 1)*00 
Thus the TM is, M = (Q, Σ, Γ, δ, q0, B, h) i.e.,
M = ({q0, q1, q2, h}, {0, 1}, {0, 1}, δ, q0, B, h).


The transition δ is given in table 13.19:


States   0          1          B
q0       (q1,0,R)   (q0,1,R)   -
q1       (q2,0,R)   (q0,1,R)   -
q2       (q2,0,R)   (q0,1,R)   (h,B,N)


Table 13.19 Transition table for M 
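
The construction used in this example is mechanical, so it can be sketched in a few lines of Python. The names `dfa_to_tm` and `accepts` are assumptions made for illustration, and the DFA below is the one for strings ending with 00, read off Table 13.19. Since the resulting TM only moves right and never alters a symbol, `accepts` simply feeds it the word followed by one blank.

```python
BLANK = "B"


def dfa_to_tm(dfa_delta, accept_states, halt="h"):
    """Build TM transitions from a DFA: relabel edges, add a halt state."""
    tm = {}
    for (state, symbol), nxt in dfa_delta.items():
        tm[(state, symbol)] = (nxt, symbol, "R")      # a becomes a/a, R
    for state in accept_states:
        tm[(state, BLANK)] = (halt, BLANK, "N")       # B/B, N to the halt state
    return tm


def accepts(tm, word, start="q0", halt="h"):
    """Run the right-moving TM on `word` followed by a single blank."""
    state = start
    for symbol in list(word) + [BLANK]:
        if (state, symbol) not in tm:
            return False
        state, _, _ = tm[(state, symbol)]
    return state == halt


# DFA for strings over {0, 1} ending with 00 (accepting state q2).
DFA = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q2", ("q1", "1"): "q0",
    ("q2", "0"): "q2", ("q2", "1"): "q0",
}
TM = dfa_to_tm(DFA, {"q2"})
print(accepts(TM, "10100"), accepts(TM, "010"))
```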


EXAMPLE 13.10.12: Construct a TM over the alphabet {0, 1} that accepts the set of strings
of 0's and 1's except those containing the substring 001.


Solution: The transition diagram of DFA, which accepts the language consisting of strings
of 0's and 1's except those containing the substring 001, is shown in figure 13.45:


[Diagram: DFA with states q0 (start), q1, q2 and q3; q0, q1 and q2 are accepting]


Figure 13.45. DFA to accept all Strings of 0’s & 1’s except those containing the Substring 
001 


Change the edge labels of the above DFA from 0 and 1 to 0/0, R and 1/1, R. Add the halt state
h and add an edge labelled B/B, N from q0 to h, q1 to h and q2 to h.


[Diagram: the DFA above with edge labels 0/0, R and 1/1, R and edges B/B, N from q0, q1 and q2 to h]


Figure 13.46. TM to accept all Strings of 0's and 1's except those containing the
Substring 001
Thus, the TM is,
M = (Q, Σ, Γ, δ, q0, B, h) i.e.,
M = ({q0, q1, q2, q3, h}, {0, 1}, {0, 1}, δ, q0, B, h).


The transition δ is given in table 13.20.


States   0          1          B
q0       (q1,0,R)   (q0,1,R)   (h,B,N)
q1       (q2,0,R)   (q0,1,R)   (h,B,N)
q2       (q2,0,R)   (q3,1,R)   (h,B,N)
q3       (q3,0,R)   (q3,1,R)   -


Table 13.20 Transition table for M 


13.10.2 TM as a Computer of Functions 


An instruction of a TM states that if the TM is in a given state and the head is pointing to
a given symbol, then the TM goes to a new state, replaces the symbol by a new
symbol and moves the head in the given direction (left or right). If no instruction is
defined for the current state and symbol, the TM halts. Defining the
instructions of a TM can be thought of as programming the TM.

For programming the TM to perform some mathematical functions, several conventions 
are commonly used: 


a. Interpretation of the symbols recorded on the tape: To describe how to interpret
the ones and zeros appearing on the tape as numbers, the number is represented in
unary notation. That is, the non-negative integer n is represented by
n successive 1's. For instance, 2 is 11, 3 is 111, 4 is 1111 and so on.

b. If a function f(n1, n2, ..., nk) has to be computed, assume that initially the tape
consists of n1, n2, ..., nk. Each sequence of 1's is separated from the previous
one by a single blank or any other special symbol. The tape head is initially located
at the rightmost or leftmost bit of the first input argument, and the TM is in a
specified initial state.

The TM is said to have computed m = f(n1, n2, ..., nk) when it halts and the tape
consists of the final result. The head is positioned at the rightmost or leftmost bit of
the result.

c. To ascertain when the machine is started and when it finishes, some special symbols
are included in the tape.
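
These conventions can be illustrated with a small Python sketch. The helper names `encode_unary` and `decode_unary` are hypothetical, and the choice of B as the separator and A as the end marker is an assumption that matches the tapes used in the examples that follow.

```python
def encode_unary(numbers, sep="B", marker="A"):
    """Write each number as a block of 1's, separated by `sep`, between markers."""
    return marker + sep.join("1" * n for n in numbers) + marker


def decode_unary(tape, sep="B", marker="A"):
    """Recover the numbers from a tape produced by encode_unary."""
    body = tape.strip(marker)
    return [len(block) for block in body.split(sep)]


print(encode_unary([3, 2]))
print(decode_unary("A111B11A"))
```

A block of length zero represents the number 0, so the tape for the pair (0, 2) under these assumptions is AB11A.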
EXAMPLE: TM to compute the function m = multiply(n1, n2) = n1 × n2. If n1 = 3 and
n2 = 2, then the tape in the beginning of computation is given in figure 13.47.


Figure 13.47. The Tape in the Beginning of the Computation 


After the computation is over, i.e. n1 × n2 = 3 × 2 = 6, the TM halts, with its tape looking
as shown in figure 13.48.


Figure 13.48. The Tape After Computation 


EXAMPLE 13.10.13: Construct a TM that finds the difference of two natural numbers: 


Design Strategy: We are to design a TM to compute the function m = SUB(n1, n2) such
that:


SUB(n1, n2) = n1 − n2   if n1 ≥ n2
            = 0         if n1 < n2

Initially, the unary sequence is written onto the tape with two numbers n1 and n2 separated
by a blank symbol B. The symbol A on tape indicates the beginning and end points of the
sequence. The tape head initially points to the symbol B, with q0 as the start state. The initial
configuration of the tape with n1 = 3 and n2 = 2 is shown in figure 13.49.


Figure 13.49. The Tape’s Initial Configuration 
Thus, the TM is M = (Q, Σ, Γ, δ, q0, B, h) i.e.,
M = ({q0, q1, q2, q3, q4, q5, q6, h}, {0}, {0, 1, x, A}, δ, q0, B, h),


where 6 is given by: 


Transition diagram: 


Figure 13.50. TM to Compute SUB Function


TM action for the input n1 = 3 and n2 = 2


ID is: A111 q0 B11A ⊢ A11 q0 1B11A
⊢ A11x q1 B11A
⊢ A11xB q1 11A
⊢ A11x q0 Bx1A
⊢ A11 q0 xBx1A
⊢ A1 q0 1xBx1A
⊢ A1x q1 xBx1A
⊢ A1xx q1 Bx1A
⊢ A1xxB q1 x1A
⊢ A1xxBx q1 1A
⊢ A1xxB q0 xxA
⊢ A1xx q0 BxxA
⊢ A1x q0 xBxxA
⊢ A1 q0 xxBxxA
⊢ A q0 1xxBxxA
⊢ Ax q1 xxBxxA
⊢ Axx q1 xBxxA
⊢ Axxx q1 BxxA
⊢ AxxxB q1 xxA
⊢ AxxxBx q1 xA
⊢ AxxxBxx q1 A
⊢ AxxxBx q4 x0
⊢ AxxxB q4 x00
⊢ Axxx q4 B000
⊢ Axx q4 xB000
⊢ Ax q4 x0B000
⊢ A q4 x00B000
⊢ q4 A000B000
⊢ A q5 000B000
⊢ A1 q6 00B..
⊢ A1A h ..


Since the final state h is reached, the TM halts, giving the difference of 3 and 2 as 1.
EXAMPLE 13.10.14: Construct a TM that finds the sum of two natural numbers.
Design Strategy: We are to design a TM to compute the function


m = Sum(n1, n2) = n1 + n2.
Initially, the unary sequence is written onto the tape with two numbers n1 and n2 separated
by a symbol B; the symbol A on tape indicates the beginning and end points of the sequence.
The tape head initially points to the leftmost bit of n1. The initial configuration of the tape
with n1 = 4 and n2 = 3 is shown in figure 13.51.


Figure 13.51. The Tape's Initial Configuration


Thus the TM is, M = (Q, Σ, Γ, δ, q0, B, h) i.e.,


M = ({q0, q1, q2, h}, {1}, {1, A}, δ, q0, B, h)


where 6 is given by: 


Transition diagram:
[Diagram: loops 1/1, R on q0 and q1; B/1, R from q0 (start) to q1; A/B, L from q1 to q2; 1/A, N from q2 to h]


Figure 13.52. TM to Compute the Sum Function 


TM action for the input n1 = 4 and n2 = 3


ID is: A q0 1111B111A ⊢ A1 q0 111B111A
⊢ A11 q0 11B111A
⊢ A111 q0 1B111A
⊢ A1111 q0 B111A
⊢ A11111 q1 111A
⊢ A111111 q1 11A
⊢ A1111111 q1 1A
⊢ A11111111 q1 A
⊢ A1111111 q2 1B
⊢ A1111111A h ..


Since the final state h is reached, the TM halts, giving the sum of 4 and 3 as 7.
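
The idea behind this machine is to overwrite the separator B with a 1 (joining the two blocks) and then to remove one 1 from the right end to compensate. The following Python sketch replays it; the function name `unary_sum` is an assumption, and the transitions are read off the trace above rather than from a transition table.

```python
BLANK = "B"

# Transitions as they appear in the trace above.
DELTA = {
    ("q0", "1"): ("q0", "1", "R"),      # skip over n1
    ("q0", BLANK): ("q1", "1", "R"),    # join the blocks: B becomes 1
    ("q1", "1"): ("q1", "1", "R"),      # skip over n2
    ("q1", "A"): ("q2", BLANK, "L"),    # old end marker becomes blank
    ("q2", "1"): ("h", "A", "N"),       # last 1 becomes the new end marker
}


def unary_sum(n1, n2):
    """Return the final tape contents for the inputs n1 and n2."""
    tape = list("A" + "1" * n1 + BLANK + "1" * n2 + "A")
    state, head = "q0", 1               # head on the first cell after the left A
    while state != "h":
        state, tape[head], move = DELTA[(state, tape[head])]
        head += {"R": 1, "L": -1, "N": 0}[move]
    return "".join(tape)


result = unary_sum(4, 3)
print(result, result.count("1"))
```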


EXAMPLE 13.10.15: Construct a TM to compute f(n) = n + 2. 


Design Strategy: We are to design a TM to compute the function m = f(n) = n + 2.


Initially, the unary sequence for n is written onto the tape with the symbol A on both the
ends of n, indicating the beginning and end points of the sequence. The tape head points to
the leftmost bit of n, with q0 as the start state. The initial tape configuration with n is shown
in figure 13.53.


Figure 13.53. The Tape's Initial Configuration


Thus the TM is, M = (Q, Σ, Γ, δ, q0, B, h) i.e.,


M = ({q0, q1, q2, h}, {1}, {1, A}, δ, q0, B, h),


where 6 is given by 
Transition diagram:
[Diagram: states q0 (start), q1, q2 and h; edge labels include 1/1, R and B/A, R]


Figure 13.54. TM to Compute f (n) 


TM action for input n = 2
ID is: A q0 11A ⊢ A1 q0 1A
⊢ A11 q0 A
⊢ A111 q1 B
⊢ A1111 q2 B
⊢ A1111A h


Since the final state h is reached, the TM halts, giving the sum of 2 and 2 as 4.


EXAMPLE 13.10.16: Design a TM to compute max(n1, n2).


Design Strategy: We are to design a TM to compute the function m = max(n1, n2) such
that:


max(n1, n2) = n1   if n1 ≥ n2
            = n2   if n1 < n2


Initially, the unary sequence is written onto the tape with two numbers n1 and n2, separated
by a blank symbol B. The symbol A on tape indicates the beginning and end points of the
sequence. The head initially points to the symbol B, with q0 as the start state. The initial
tape configuration with n1 = 1 and n2 = 2 is shown in figure 13.55.
Figure 13.55. The Tape's Initial Configuration


Transition diagram: 


[Diagram: states q0 to q5 and h; surviving edge labels include B/A, R; 1/1, R and 1/1, L]


Figure 13.56. TM to Compute max(n1, n2)
TM action for input n1 = 1 and n2 = 2, with Γ = {0, 1, A, x}, Σ = {0, 1} and
Q = {q0, q1, q2, q3, q4, q5, h}
ID is: A q0 1B11A ⊢ Ax q1 B11A
⊢ AxB q1 11A
⊢ Ax q0 Bx1A
⊢ A q0 xBx1A
⊢ q0 AxBx1A
⊢ 0 q2 xBx1A
⊢ 00 q2 Bx1A
⊢ 00A q3 x1A
⊢ 00A1 q3 1A
⊢ 00A11 q3 A
⊢ .A11A h

Since the final state h is reached, the TM halts, giving the maximum of 1 and 2 as 2.


EXAMPLE 13.10.17: Construct a TM that finds the product of two natural numbers.
Design Strategy: We are to design a TM to compute the function


m = multiply(n1, n2) = n1 × n2.


Initially, the unary sequence is written onto the tape with two numbers n1 and n2, separated
by a blank symbol B. The symbol A on the tape indicates the beginning and end points of the
sequence. The tape head initially points to symbol B, with q0 as the start state. The initial
configuration of the tape with n1 = 2 and n2 = 3 is shown in figure 13.57.


Figure 13.57. Tape's Initial Configuration
Thus, the TM is M = (Q, Σ, Γ, δ, q0, B, h) i.e.,


M = ({q0, q1, q3, q4, q5, q6, h}, {1}, {0, 1, A, x}, δ, q0, B, h),


where δ is given by:

Transition diagram:
[Diagram: surviving edge labels include B/B, R; 0/0, L; x/1, R; B/B, L; 1/1, R; 1/x, R; A/A, R; 0/x, L; A/A, L and x/x, L]


Figure 13.58. TM to Compute the Multiply Function 
TM action for the input n1 = 2 and n2 = 3


i. Initial tape: A11B111A
ii. A10B111A
iii. Move right until A10B111A
iv. After 3 operations we have A10B11xA0 (state q4)
v. A10B11xAx
vi. Move left until A10B11xAx (state q3)
vii. Replace 1 by x and move right until A10B1xxAx0 (state q4)
viii. Replace 0 by x and move left until A10B1xxAxx
ix. Replace 1 by x and move right until A10BxxxAxx0 (state q4)
x. Replace 0 by x and move left until A10BxxxAxxx (state q3)
xi. Move left until A10BxxxAxxx (state q0)

(iv to x) is a loop writing 'n' once, as n 1's, on the right-hand side of the rightmost A.

xii. Replace 1 by 0 and move right, changing x's to 1's until A00B111Axxx
xiii. Now enter the loop again to write n once more, over the zeros on the right of the
rightmost A, until A00BxxxAxxxxxx (state q3)
xiv. Now move left until A00BxxxAxxxxxx
xv. Move right until 000BxxxAxxxxxx
xvi. Move right until 0000000Axxxxxx (state q5)
xvii. Move right, replacing each x by 1, to get the final output:

0000000A111111A


EXAMPLE 13.10.18: Construct a TM to compute n².
Design Strategy: We are to design a TM to compute the function


m = square(n) = n × n.


Consider the initial configuration of tape with n = 2, as shown in figure 13.59.


Figure 13.59. Tape's Initial Configuration-1


First convert the above tape into the form shown in figure 13.60.


Figure 13.60. Tape's Initial Configuration-2



Thus, we require two TMs: one to convert from configuration-1 to configuration-2 and
another to multiply n by n.

The transition diagram of TM, to convert from configuration-1 to configuration-2 is 
shown in figure 13.61. 


[Diagram: TM1, whose last edge leads into TM2]


Figure 13.61. Transition Diagram for TM1 


TM action for input n = 2 with Q = {q0, q1, q2, q4, q6, q7, q8, q9, q10, q11}, Σ = {1}
and Γ = {1, x, A}.

ID is: q0 B11 ⊢ A q1 11
⊢ Ax q2 1
⊢ Ax1 q2
⊢ Ax1B q4
⊢ Ax1 q6 B1
⊢ Ax q7 1B1
⊢ A q7 x1B1
⊢ Ax q1 1B1
⊢ Axx q2 B1
⊢ AxxB q4 1
⊢ AxxB1 q4
⊢ AxxB q6 11
⊢ Axx q6 B11
⊢ Ax q7 xB11
⊢ Axx q1 B11
⊢ Ax q8 xB11
⊢ A q8 x1B11
⊢ q8 A11B11
⊢ A q9 11B11
⊢ A1 q9 1B11
⊢ A11 q9 B11
⊢ A11B q10 11
⊢ A11B1 q10 1
⊢ A11B11 q10
⊢ A11B1 q11 1A
⊢ A11B q11 11A
⊢ A11 q11 B11A
⊢ A11B11A (head on B, control passes to TM2)


We are now in configuration-2. Thus, TM2 is called to multiply n by n (TM2 is already
discussed in example 13.10.17).


13.11 Exercises 


1. Define a TM. Present the formal definition of a TM. 
2. Explain, with a neat diagram, the components of a TM. 
3. Differentiate between FA/PDA vs. TM with respect to: 
a. tape and head
b. halt state and final states.
4. Explain the different ways to describe a TM, with examples.
5. What is an infinite loop in TM? Explain with an example.
6. Present the formal definition of the ID of a TM.
7. Explain, with examples, the different string classes of TM.
8. Design a TM to accept the language L = {ww^R | w ∈ (a + b)*}.
9. Design a TM to recognise the language of all strings of even length, over {a, b}.
10. Design a TM that accepts the language of all strings, which contain 101 as a substring
over {0, 1}.
11. Design a TM that accepts strings with twice as many 0's as 1's, over {0, 1}.
12. Design a TM to compute the function f(n) = 2n.
13. Design a TM to compute the function n mod 2.
14. Construct a TM that will search for and locate a symbol A on its tape (if there is one)
and then halt.
15. Present a TM that decides the following languages, over {a, b}:
(i) ∅
(ii) {a}*
(iii) {a*ba*b}.
16. Identify the language accepted by the TM given in figure 13.62. 




Figure 13.62. Transition Diagram 
'Increment in knowledge results in the enhanced versions of the systems concerned'


Introduction 


In the previous chapter, we discussed the more powerful model of computation through 
Turing machine, along with a number of examples pertaining to the cases: 


a. Turing machine as an accepting device for the languages 
b. Turing machine as a computer of functions. 


In this chapter, we first discuss a number of extensions, or enhanced versions,
of the standard TM and later the languages accepted by a TM and their properties, through
recursive and recursively enumerable languages. Each of the extensions discussed here
is no doubt a powerful model, but is equivalent to the standard TM.


14.1 Extensions of TM 


In the previous chapter, the Turing machine defined is referred to as the Standard turing 
machine, because the tape is assumed to be arbitrarily extensible to the left as well as to the 
right and the machine is supplied with enough tape required for computation. 


[Diagram: tape cells i, i+1, i+2, ..., i+n; bounded on the left end, no boundary on the right side]


Figure 14.1. Tape in a Standard TM 


The cell i, in figure 14.1, indicates that it is the leftmost cell provided to the machine for 
the current computation, but there is no boundary on the right side of the tape. 


Downloaded from https://www.cambridge.org/core. Stockholm University Library, on 06 Dec 2018 at 08:05:27, subject to the Cambridge Core terms of use, available at 
https://www.cambridge.org/core/terms. https://doi.org/10.1017/UPO09788175968363.015 


TM Extensions and Languages 

If, for the standard TM, the following characteristics are considered, then we get different
types of TM and these are called the extensions of TM. The characteristics are:

a. The tape may be allowed to be infinite in both directions.
b. There may be more than one head, processing various cells of the tape.
c. There may be several tapes, each having its own independent head.
d. The tape may be k-dimensional, instead of only one-dimensional.
e. The machine may adopt the nondeterministic approach.


Accordingly, the following are the extensions of a TM: 


14.1.1 Two-Way Infinite Tape TM 


A TM in which there is an infinite sequence of blanks on either side of the tape
is said to be a two-way infinite tape TM. The advantage of having this type of TM is that there
is no possibility of jumping off the left end of the tape.

The initial configurations of the tape for the input string 101, with respect to a standard 
TM and two-way infinite tape TM, are shown below: 


Figure 14.2. Standard TM with Boundary on Left-Side 




Figure 14.3. Two-Way TM with No Boundaries 
Formally, a two-way infinite tape TM is represented as 7-tuple:


M = (Q, Σ, Γ, δ, q0, B, h)


where
Q is a finite set of states.
Σ is the finite set of non-blank symbols.
Γ is the set of tape characters.
q0 ∈ Q is the initial state.
B is the blank character.
h ∈ Q is the final state and
δ: Q × Γ → Q × Γ × {L, R, N}.


14.1.2 Multi-Tape Turing Machine 


A TM with more than one tape, each tape having its own independent head, is said to be
a multi-tape TM.


[Diagram: Tape-1, Tape-2, ..., Tape-n, each with its own head]


Figure 14.4. A Multi-Tape TM 


Initially, the input appears on tape 1 and the others start out blank. The transition 
function is changed to allow for reading, writing, and moving the heads on all the tapes 
simultaneously. 
Formally, a multi-tape TM is represented as 7-tuple
M = (Q, Σ, Γ, δ, q0, B, h)


where
Q is the finite set of states.
Σ is the finite set of non-blank symbols.
Γ is the set of tape characters.
q0 ∈ Q is the initial state.
B is the blank character.
h ∈ Q is the final state and
δ: Q × Γ^k → Q × Γ^k × {L, R}^k
(k is the number of tapes).


The advantage of having multi-tape TM is that the design of some functions like copying, 
reversing, verifying whether a string is a palindrome or not etc., can be carried out in much 
easier way as compared to the design of the corresponding standard TMs. 
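
To make this advantage concrete, the following Python sketch mimics a two-tape palindrome check. It is only an informal model under stated assumptions: the two tapes are Python lists, the heads are integer indices, the name `is_palindrome_two_tape` is hypothetical, and no formal transition function is given. The input is copied to the second tape, the first head is rewound, and the two heads then move in opposite directions while comparing symbols.

```python
def is_palindrome_two_tape(word):
    """Return (result, number of head-move steps) for a two-tape style check."""
    tape1 = list(word)
    tape2 = []
    moves = 0

    # Phase 1: copy the input from tape 1 to tape 2, both heads moving right.
    for symbol in tape1:
        tape2.append(symbol)
        moves += 1

    # Phase 2: rewind head 1 to the left end; head 2 stays on the last symbol.
    moves += len(tape1)
    head1, head2 = 0, len(tape2) - 1

    # Phase 3: head 1 moves right and head 2 moves left, comparing symbols.
    while head1 < len(tape1):
        moves += 1
        if tape1[head1] != tape2[head2]:
            return False, moves
        head1 += 1
        head2 -= 1
    return True, moves


print(is_palindrome_two_tape("abba"))
print(is_palindrome_two_tape("ab"))
```

In this model the number of steps grows linearly with the input length, whereas the single-tape machine of example 13.10.9 has to travel back and forth across the remaining input for every pair of symbols it matches.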


14.1.3 Multi-Head TM 


A TM with one tape and several heads on it is said to be a multi-head Turing machine.

In a multi-head TM having k heads (k ≥ 2; the heads are numbered 1 through k), a
move of TM depends on the state and on the symbol scanned by each head. In one move,
each head can move independently to left or right or remain stationary.

Also, the use of multi-heads (like multi-tapes) can sometimes simplify the construction 
of complex TMs drastically. 


14.1.4 K-Dimensional TM


A TM whose tape is an infinite k-dimensional grid is said to be a k-dimensional TM.

In this type of TM, the tape consists of a k-dimensional array of cells, infinite in all
2k directions, for some fixed k. Depending on the state and symbol scanned, the machine
changes its state, writes a new symbol and moves its tape head in one of the 2k directions,
either positively or negatively along one of the k axes. Initially, the input is along one axis
and the head is at the left end of the input.

At any time, only a finite number of rows in any dimension contain non-blank symbols 
and each of these rows has only a finite number of non-blank symbols. 

Figure 14.5 shows a two-dimensional tape, having boundaries on the left and the bottom. 
Each cell is given an address (x, y) where x is the row number of the cell and y is the column 
number of the cell. 


Figure 14.5. Two Dimensional Tape 


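The k-dimensional TMs are often more convenient than standard TMs for designing solutions to
problems whose data is naturally laid out in several dimensions, although they recognise no
additional languages. This type of TM can also be combined with other extensions of
TM.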


14.1.5 Off Line TM 


An off-line TM is a multi-tape TM whose input tape is read-only. The input is
endmarked by ¢ on the left and $ on the right. The TM is not allowed to move the input
tape head off the region between ¢ and $. An off-line TM can simulate any TM, T, by using
one more tape.

The first step of the off-line TM is to copy its own input onto the extra tape; it then simulates
T as if the extra tape were T’s input. The off-line TM is useful for studying computations in
which the amount of storage space is limited to less than the input length.


14.1.6 Non-Deterministic TM 


In a standard (deterministic) TM, for the current state (q) and the symbol (s) being scanned
by the tape head, the TM performs a unique action: it writes a symbol in the cell
being scanned and repositions the head with a move (M) to the left, to the right, or not at all.

The transition of the TM is represented as:


δ(q, s) = (q_i, s_i, M_i) for i = 1.


In a nondeterministic TM, for the current state (q) and symbol (s) being scanned by the
tape head, the TM has a finite set of choices for writing a symbol in the cell being scanned
and repositioning the head with a move (M) to the left, to the right, or not at all.

The transition of the TM is represented as:

δ(q, s) = (q_i, s_i, M_i) for i = 1, 2, ...

(a) A Deterministic TM: δ(q, s) leads to a single successor (q_1, s_1, M_1)

(b) A Nondeterministic TM: δ(q, s) leads to several possible successors (q_i, s_i, M_i)


Figure 14.6. Transitions of a Deterministic and Non-Deterministic TM


Formally, a nondeterministic TM is represented as a 7-tuple:


M = (Q, Σ, Γ, δ, q0, B, h)


where
Q is the finite set of states.
Σ is the finite set of non-blank symbols.
Γ is the set of tape characters.
q0 ∈ Q is the initial state.
B is the blank character.
h ⊆ Q is the set of final states.
δ : Q × Γ → P(Q × Γ × {L, R}), where P denotes the power set.


Note-1: 


■ The extensions of the TM are generalisations of the standard TM and they are
equivalent to (and not more powerful than) the standard TM model of computation.

■ The power of a TM is not enhanced by the use of extra heads or extra tapes.

■ A deterministic TM is a special case of a nondeterministic TM, and any
nondeterministic TM has an equivalent deterministic TM.
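The last point can be made concrete with a small sketch: a deterministic program can simulate a nondeterministic TM by exploring all of its configurations breadth-first. The Python below is a minimal illustration under stated assumptions, not a construction from the text. The dictionary-based transition table, the function name ntm_accepts, the step budget max_steps and the sample machine (which accepts binary strings containing 11) are all hypothetical.

```python
from collections import deque

def ntm_accepts(delta, start, finals, w, blank="B", max_steps=1000):
    """Breadth-first search over configurations (state, head, tape).

    delta maps (state, symbol) to a set of (new_state, written, move)
    choices, with move in {"L", "R", "N"}. The search is cut off after
    max_steps moves on any branch (an assumption made for this sketch).
    """
    first = (start, 0, tuple(w) or (blank,))
    queue = deque([(first, 0)])
    seen = {first}
    while queue:
        (state, head, tape), steps = queue.popleft()
        if state in finals:
            return True  # some branch reached a final state
        if steps == max_steps:
            continue
        for new_state, written, move in delta.get((state, tape[head]), ()):
            cells = list(tape)
            cells[head] = written
            new_head = head + {"L": -1, "R": 1, "N": 0}[move]
            if new_head < 0:
                continue  # branch falls off the left end and dies
            if new_head == len(cells):
                cells.append(blank)  # extend the tape on demand
            config = (new_state, new_head, tuple(cells))
            if config not in seen:
                seen.add(config)
                queue.append((config, steps + 1))
    return False  # no branch accepted within the budget

# Hypothetical machine: guess where the substring 11 starts.
delta = {
    ("q0", "0"): {("q0", "0", "R")},
    ("q0", "1"): {("q0", "1", "R"), ("q1", "1", "R")},
    ("q1", "1"): {("h", "1", "N")},
}
```

With this table, ntm_accepts(delta, "q0", {"h"}, "0110") finds an accepting branch, while the search for "0101" exhausts every branch without reaching h.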
14.2 General Types of TM


14.2.1 A Random Access TM 


A random access TM has a fixed number of registers and a one-way infinite tape. Each
register and each tape cell is capable of containing an arbitrary natural number. A sequence
of instructions, called a program, acts on the tape cells and the registers, with a program counter
that indicates the address of the instruction to be executed next. The operation of a random
access TM ends when a ‘halt’ instruction is executed.



Figure 14.7. Random Access TM 


14.2.2 Universal Turing Machine 


A universal Turing machine is one that is capable of mimicking the action of any other Turing
machine.

Consider a fixed Turing machine U with the property that, for each and every other TM
T (that computes f(x1, ..., xn)), there is a string of symbols A_T that describes the states and
transitions of T. Suppose the unary numbers x1, ..., xn are written on the tape, followed by a
delimiter symbol and the string A_T. If U is started in the state q0 on the leftmost symbol
of A_T, then f(x1, ..., xn) will be written on U’s tape when U stops, where f(x1, ..., xn) is the
output of T.

A universal TM thus behaves like a stored-program computer: it executes an algorithm,
supplied as a description, on an input (x1, ..., xn) and outputs the result.
14.2.3 Alternating Turing Machine


A nondeterministic TM whose states include universal states is said to be an alternating TM.
The idea is that the TM accepts from a universal state only if all possible transitions from
that state lead to acceptance.


14.2.4 Probabilistic Turing Machine 


A Turing machine that chooses among its possible transitions at random is a probabilistic TM.


14.2.5 Oracle Turing Machine 


It is a special kind of TM that has an extra tape and 3 additional states q_o, q_y and q_n.
Upon entering the state q_o, an oracle is consulted to produce an answer (in a single step) to
a given input on the tape. If the input is in the oracle set, then the control goes to the state
q_y, otherwise it goes to q_n.


14.3 Linear Bounded Automata 


A linear bounded automaton (LBA) is a restricted form of a non-deterministic TM. It 
possesses 


a. a tape made up of cells that can contain symbols from a finite alphabet
b. a head that can read from or write to one cell on the tape at a time (and can be moved)
c. a finite number of states.


It differs from a TM in the sense that (while the tape is initially considered infinite) only a
finite contiguous portion of the tape (the length of which is a linear function of the length of the
initial input) can be accessed by the read/write head. In some respects, this limitation makes an
LBA a more accurate model of computers that actually exist than a TM.


Figure 14.8. LBA: All Computation is Done Between End Markers (the work space lies between the left-end marker and the right-end marker)
Formally, the LBA is a TM:
M = (Q, Σ, Γ, δ, q0, B, h)


where
Q is the finite set of states.
Σ is the input alphabet, which also contains the two special symbols ‘[’ and ‘]’.
Γ is the set of tape symbols.
δ is the transition function from Q × Γ to 2^(Q × Γ × {L, R}), with two more transitions
of the form δ(q_i, [) = (q_j, [, R) and δ(q_i, ]) = (q_j, ], L) forcing the read/write head
to stay within the boundaries ‘[’ and ‘]’.
q0 is the start state.
B is a special symbol indicating the blank character.
h ⊆ Q is the set of final states.


An LBA is the same as a TM, except that the tape space occupied by the input string is the only
tape space it is allowed to use. LBAs are acceptors for the class of context-sensitive
languages.


14.4 Encoding Turing Machines 


There are many ways to encode a Turing machine. One of the simplest methods is binary
encoding. The following are the steps required in binary encoding:


1. Assume that the Turing machine T has {0, 1, B} as its tape alphabet.
2. Rename all states in T as q1, q2, ...
3. Rename all tape symbols as x1, x2, ...
4. Rename the tape directions as D1, D2 and D3, indicating left, right and no move.
5. Redefine each transition rule in T as

δ(q_j, x_L) = (q_x, x_y, D_z)

where j and x are state indices, L and y are tape-symbol indices, and z ∈ {1, 2, 3}.

6. Encode each rule of T as 0^j 1 0^L 1 0^x 1 0^y 1 0^z.
EXAMPLE 14.4.1: Consider a TM to accept r = aba*, whose transitions are shown in
figure 14.9:


Figure 14.9. Transition Diagram for TM to Accept r = aba*

The encoding of the above TM, using binary encoding, is as follows:


Step 1: Let the tape alphabet be {0, 1, B}, instead of {a, b, B}.
Step 2: {q0, q1, q2, q3, q4, q5} = {q1, q2, q3, q4, q5, q6}.
Step 3: {0, 1, B} = {x1, x2, x3}.
Step 4: {L, R, N} = {D1, D2, D3}.
Step 5: Redefine each transition rule in T and encode.


a. δ(q0, B) = (q1, B, L).
   => δ(q1, x3) = (q2, x3, D1).
   For q1, j = 1, for x3, L = 3, for q2, x = 2,
   for x3, y = 3 and for D1, z = 1.
   Thus, using unary number conventions, we have
   0^j 1 0^L 1 0^x 1 0^y 1 0^z
   = 0^1 1 0^3 1 0^2 1 0^3 1 0^1
   => 01000100100010.
b. δ(q1, a) = (q2, a, L).
   => δ(q1, 0) = (q2, 0, L).
   => δ(q2, x1) = (q3, x1, D1)
   => 001010001010.
c. δ(q2, b) = (q3, b, L).
   => δ(q2, 1) = (q3, 1, L).
   => δ(q3, x2) = (q4, x2, D1)
   => 0001001000010010.
d. δ(q3, a) = (q3, a, L).
   => δ(q3, 0) = (q3, 0, L).
   => δ(q4, x1) = (q4, x1, D1)
   => 000010100001010.
e. δ(q3, B) = (q4, B, N).
   => δ(q4, x3) = (q5, x3, D3)
   => 0000100010000010001000.


Thus, the encoded transitions of TM are: 


01000100100010 
001010001010 
0001001000010010 
000010100001010 
0000100010000010001000. 
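The encoding steps can be checked mechanically. The following Python sketch applies the scheme 0^j 1 0^L 1 0^x 1 0^y 1 0^z to the five transitions of this example, after the tape symbols a and b have been replaced by 0 and 1. The function name encode_rule and the dictionary layout are assumptions made for illustration only.

```python
# Index tables from Steps 2 to 4 of the example.
STATE_INDEX = {"q0": 1, "q1": 2, "q2": 3, "q3": 4, "q4": 5, "q5": 6}
SYMBOL_INDEX = {"0": 1, "1": 2, "B": 3}
MOVE_INDEX = {"L": 1, "R": 2, "N": 3}

def encode_rule(state, symbol, new_state, written, move):
    """Encode delta(state, symbol) = (new_state, written, move) in binary:
    five unary blocks of 0s separated by single 1s."""
    blocks = [
        STATE_INDEX[state],
        SYMBOL_INDEX[symbol],
        STATE_INDEX[new_state],
        SYMBOL_INDEX[written],
        MOVE_INDEX[move],
    ]
    return "1".join("0" * n for n in blocks)

# The five transitions (a) to (e), with a -> 0 and b -> 1.
rules = [
    ("q0", "B", "q1", "B", "L"),
    ("q1", "0", "q2", "0", "L"),
    ("q2", "1", "q3", "1", "L"),
    ("q3", "0", "q3", "0", "L"),
    ("q3", "B", "q4", "B", "N"),
]
encoded = [encode_rule(*rule) for rule in rules]
```

The list encoded then holds the same five strings as the hand computation above.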


14.5 TM vs. Real Machines 


a. A TM can compute anything a real computer computes.


For example: A TM can simulate any type of subroutine found in programming
languages, including recursive procedures and any of the known parameter-passing
mechanisms.


b. A TM has the ability to manipulate an unbounded amount of data. However, given a
finite amount of time, a TM (like a real machine) can only manipulate a finite amount
of data.


c. Like a TM, a real machine can have its storage space enlarged if required, by acquiring
more disks or other storage media. If the supply of these runs short, the TM may
become less useful as a model. But the fact is that neither a TM nor a real machine
needs astronomical amounts of storage space in order to perform useful computations.
The processing time required usually poses a bigger challenge.


d. TMs describe algorithms, independent of how much memory they utilise. There is a 
limit to the memory possessed by any current machine, but this limit can rise arbitrarily 
in time. TM allows us to make statements about algorithms, which will hold forever, 
regardless of advances in conventional computing machine architecture. 
e. Real machines are much more complex than a TM.


For example, a TM describing an algorithm may have a few hundred states, while 
the equivalent DFA on a given real machine has quadrillions of them. This makes the 
analysis of DFA representation infeasible. 


f. One way in which TMs are poor models for programs is that many real programs, 
such as operating systems and word processors, are written to receive unbounded input 
over time and, therefore, do not halt. In other words, TMs do not model such ongoing 
computation well. However, they can still model portions of it, such as individual 
procedures. 


14.5.1 The Power of TMs 


TMs represent more than an incremental increase in power, over a pushdown automaton. 
A TM can recognise a larger set of languages than any of the machines we have studied so 
far. It can recognise regular and context-free languages and a lot more. Once it recognises a 
language pattern, it can perform some action associated with the pattern. This is the essence 
of a computer’s actions. 


14.6 TM Languages 


A TM halts if the current state is q, the current tape symbol is ‘a’ and δ(q, a) is undefined.

We can make sure that a TM always halts, when it enters a final state, by removing all 
exit transitions. A TM rejecting the string w means that it does not accept w. A TM can 
reject by running forever, without entering a final state. 


This creates an interesting and important problem: while the TM is still running, it cannot be
ascertained whether it has rejected the input or just has not accepted it yet. Thus, for a
Turing machine, there are two types of languages:


■ Turing Acceptable Language or Recursively Enumerable Language
■ Turing Decidable Language or Recursive Language


Before we discuss these languages, we first define the following: 


■ Decider - A machine that always halts is called a decider. As it always halts, the machine
is able to decide whether a given string is a member of a formal language, hence the
name.

■ Decision problem - A decision problem is any arbitrary yes-or-no question on an
infinite set of inputs.
For example, the problem ‘given two numbers x and y, does x evenly divide y?’ is a
decision problem. The answer can be either ‘yes’ or ‘no’ and depends upon the values
of x and y.

■ Decision procedure - Methods used to solve decision problems are called decision
procedures or algorithms.
For example, for the problem ‘given two numbers x and y, does x evenly divide y?’, the
algorithm would explain how to determine whether x evenly divides y, given x and y.

■ Decidable - A decision problem which can be solved by some algorithm is called
decidable.

■ Decidable problem/algorithmically solvable/recursively solvable - A decision problem
that can be solved by an algorithm that halts on all inputs in a finite number of steps is
called a decidable problem.

■ Decidable language - A language for which membership can be decided by an algorithm
that halts on all inputs in a finite number of steps; equivalently, it can be recognised
by a TM that halts for all inputs.

■ Undecidable - A problem that cannot be solved for all cases by an algorithm is called
an undecidable problem.

■ Undecidable language - A language for which membership cannot be decided by
an algorithm; equivalently, it cannot be recognised by a TM that halts for all inputs.

■ Solvable - A computational problem that can be solved by a TM is called a solvable
problem. The problem may have a non-binary output.

■ Computable - A function that can be computed by an algorithm is called computable
(equivalently, it is computable by a TM).

■ Tractable - A problem which has an algorithm that solves all instances of it in
polynomial time is called a tractable problem.

■ Intractable - A problem for which no algorithm exists that solves all instances
of it in polynomial time is intractable.


14.6.1 Turing Acceptable/Recursively Enumerable Language 


Definition: A language is said to be recursively enumerable if there exists a TM that will
halt and accept when presented with any string w ∈ Σ* in the language as input, but may
either halt and reject or loop forever when presented with a string not in the language.

Thus, for a Turing acceptable language L, there is a Turing machine that halts
on every input string w ∈ L, but there may be strings w ∉ L
on which it does not halt.

A decision problem is called partially decidable (or semi-decidable) if the set of inputs
for which the answer is ‘yes’ is a recursively enumerable set.




Downloaded from https://www.cambridge.org/core. Stockholm University Library, on 06 Dec 2018 at 08:05:27, subject to the Cambridge Core terms of use, available at 
https://www.cambridge.org/core/terms. https://doi.org/10.1017/UP09788175968363.015 




EXAMPLE 14.6.1: Consider a TM that will search for and locate a symbol A on its tape, if 
there is one, and then halt. 


Figure 14.10. A TM to Search for Symbol A on its Tape. P is Any Symbol Apart from A


■ For the input string w = PPAP, the TM operates from left to right searching for ‘A’.
When it finds ‘A’, it halts.


■ For the input string w = PPP, the TM will go on forever looking for A, i.e., it will not
halt. This is known as ‘semidecidability’. It can find an answer if there is one, but does
not know when to stop if there isn’t any.
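The behaviour of this search machine can be imitated in a few lines of Python. The sketch below is an illustration only: the function name search_for_A is hypothetical, cells beyond the input are assumed to hold the blank symbol B, and the optional max_steps budget is an addition that lets an observer give up, something the TM itself cannot do.

```python
from itertools import count

def search_for_A(tape, max_steps=None):
    """Scan right from cell 0 until the symbol 'A' is found.

    With max_steps=None the scan is unbounded, so it never returns when
    the tape contains no 'A'. With a budget, None is returned once the
    budget runs out, which means 'no verdict', not 'rejected'.
    """
    positions = count() if max_steps is None else range(max_steps)
    for pos in positions:
        symbol = tape[pos] if pos < len(tape) else "B"  # blank beyond input
        if symbol == "A":
            return pos  # found: halt
    return None  # budget exhausted without an answer
```

Calling search_for_A("PPAP") returns the position of A, whereas search_for_A("PPP") without a budget would run forever, which is exactly the situation described above.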

EXAMPLE 14.6.2: Consider a TM that accepts the language aa*, over Σ = {a, b}.


Figure 14.11. A TM to Accept aa* (transition labels: a/a, R and b/b, L)


■ For the input string w = aaa, the TM halts and hence w ∈ L.


■ For the input string w = aba, the TM loops forever and hence w ∉ L.
14.6.2 Turing Decidable/Recursive Language


Definition: A language is said to be recursive if there exists a TM which, when presented
with any input string w ∈ Σ*, will halt and accept if the string is in the language,
and will otherwise halt and reject the string.

Thus, for a Turing decidable language L, there is a Turing machine that halts on
every input w, whether or not w belongs to L.

A TM that always halts is known as a decider or a total Turing machine and is said
to decide the recursive language. A recursive language is also called a recursive set or
a decidable language.


EXAMPLE 14.6.3: The language L = {a^n b^n c^n | n ≥ 0} is Turing decidable.
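To see why such a language is decidable, it helps to write a procedure that always halts with a yes or no answer. The Python sketch below is one possible decider, given purely as an illustration. It assumes the reading n ≥ 0 (so the empty string is accepted), and the function name decide_anbncn is hypothetical.

```python
def decide_anbncn(w):
    """Decide membership in {a^n b^n c^n | n >= 0}.

    The procedure halts on every input: it inspects the length once and
    performs a single comparison, so there is no way for it to loop.
    """
    n, remainder = divmod(len(w), 3)
    if remainder != 0:
        return False  # length is not a multiple of 3: reject
    return w == "a" * n + "b" * n + "c" * n
```

Unlike the search machine of Example 14.6.1, this procedure gives a definite ‘no’ for strings outside the language.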


Note-2: 
■ All recursive languages are also recursively enumerable.


■ There may be languages which are recursively enumerable but not recursive.


■ The set of all possible words over the alphabet of a recursive language is a recursive
set.


■ The set of all possible words over the alphabet of a recursively enumerable language is
a recursively enumerable set.


14.6.3 Properties of Recursive and Recursively Enumerable 
Languages 


Theorem I: Recursive languages are closed under complementation. 


Proof. If a language L is recursive, there is a TM, T, that accepts it and always halts.
Thus, L = L(T).
Now we construct a TM, T’, such that L’ = L(T’), where L’ is the complement of L, as follows:


■ Make all accepting states of T non-accepting states in T’.


■ Add a new accepting state h’ to T’.
■ For each non-accepting state of T and each tape symbol on which no transition is defined,
add a transition to the new final state h’.


Thus, if T enters a final state on input w, then T’ halts without accepting. If T halts
without accepting, T’ enters a final state.


=> L’ = L(T’)
Theorem II: If L and L’ are both recursively enumerable, then L and L’ are recursive.


Proof. Let T1 and T2 be two Turing machines, accepting languages L and L’ respectively.
Now construct a TM, T, to simulate T1 and T2 in parallel for the input w.


■ T accepts w and halts, if T1 accepts w.
■ T rejects w and halts, if T2 accepts w.


Every string w belongs to either L or L’, so one of T1 and T2 eventually accepts and T
always halts. Thus L = L(T) is recursive and, since the complement of a recursive language
is recursive, L’ is also recursive.


Theorem III: Union of two recursive languages is recursive. 


Proof. Let T1 and T2 be two Turing machines that always halt, accepting languages L1 and L2
respectively. Now construct a TM, T, to simulate T1 and T2.


■ T first simulates T1 and accepts if T1 accepts. If T1 rejects, T simulates T2 and
accepts iff T2 accepts; in either case T finally halts.


Since T always halts, T decides L1 ∪ L2.


Theorem IV: Recursively enumerable languages are closed under union. 


Proof. Let T1 and T2 be two Turing machines, accepting languages L1 and L2
respectively. Now construct a TM, T, to simulate T1 and T2 in parallel.


■ T accepts w and halts iff either T1 or T2 accepts w. Thus, T accepts L1 ∪ L2.
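The parallel simulation in this proof can be sketched in Python by interleaving the two machines one step at a time, so that a machine which loops forever cannot block the other. This is an illustrative sketch under stated assumptions: each recogniser is modelled as a generator that yields False after every step that does not accept and yields True when it accepts, and the names union_recognizer and accepts_if_contains are hypothetical.

```python
def union_recognizer(rec1, rec2, w):
    """Run two recognisers in lockstep on w; accept as soon as one accepts.

    Returns False only if both recognisers halt without accepting. If
    neither ever accepts and at least one runs forever, so does this call.
    """
    runs = [rec1(w), rec2(w)]
    while runs:
        for run in list(runs):
            try:
                if next(run):  # advance this machine by one step
                    return True
            except StopIteration:
                runs.remove(run)  # this machine halted without accepting
    return False

def accepts_if_contains(symbol):
    """Build a recogniser that scans right forever looking for symbol."""
    def recognizer(w):
        pos = 0
        while True:
            if pos < len(w) and w[pos] == symbol:
                yield True  # accept
                return
            pos += 1
            yield False  # one more step taken, no verdict yet
    return recognizer
```

For example, union_recognizer(accepts_if_contains("A"), accepts_if_contains("C"), "PPC") accepts even though the first recogniser alone would never halt on that input.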


We discuss recursive and recursively enumerable language further, in chapter 15. 


14.7 Undecidable Problem 


An undecidable problem is a problem whose language is not a recursive one. Such problems 
cannot be solved by computers, in general. The following is the list of undecidable problems: 


■ The halting problem
It is a decision problem which can be informally stated as:


‘Given a description of a program and a finite input, decide whether the program
finishes running or will run forever’.


Formally, it is stated as
‘Given an arbitrary machine M and a string w, does M halt with w as the
input string?’


The halting problem is undecidable over TMs.


■ Rice’s theorem
It states that all non-trivial properties of the behaviour of computer programs (that is, of
the languages they recognise) are undecidable.
■ Determining whether a context-free grammar generates all possible strings, or whether it is
ambiguous.
■ Given two context-free grammars, determining whether they generate the same set of
strings, or whether one generates a subset of the strings generated by the other, or whether
there is any string at all that is generated by both.
■ The Post correspondence problem.
■ The word problem for groups.
■ The word problem for certain formal languages.
■ Determining whether two finite simplicial complexes are homeomorphic.
■ Determining whether the fundamental group of a finite simplicial complex is trivial.


14.8 P and NP Classes of Languages 


A class of languages is a set of languages that share some characteristic features. Since a 
language is a set of strings from a finite alphabet, a class of languages is a set of sets. 

The language class P is the set of languages for which there exists a deterministic TM
that accepts each string of the language in a number of transitions bounded by a fixed
polynomial in the length of the input string.

The language class NP is the set of languages for which there exists a nondeterministic
TM that accepts each string of the language in a number of transitions bounded by a fixed
polynomial in the length of the input string; the TM does not have to reject a string
outside the language in any prescribed number of moves.


14.8.1 NP-Completeness 


A decision problem C is NP-complete, if
a. It is in NP
b. It is NP-hard, i.e., every other problem in NP is reducible to it.


‘Reducible’ here means that, for every problem L in NP, there is a polynomial-time many-one
reduction: a deterministic algorithm which transforms each instance l of L into an instance c
of C, such that the answer to c is YES iff the answer to l is YES.
The list below shows some well-known problems that are NP-complete, when expressed
as decision problems:


■ Boolean satisfiability problem
■ N-puzzle
■ Knapsack problem
■ Hamiltonian cycle problem
■ Travelling salesman problem
■ Subgraph isomorphism problem
■ Subset sum problem
■ Clique problem
■ Vertex cover problem
■ Independent set problem
■ Graph colouring problem

14.9 Church Turing Thesis 


The thesis is named after Alonzo Church and Alan Turing; it was stated explicitly as a thesis
by Stephen C. Kleene in 1943.

It can be stated as: 

‘Every function which would naturally be regarded as computable can be computed by
a Turing machine’.

Due to the vagueness of the concept of a ‘function which would naturally be regarded as
computable’, the thesis cannot be proven formally. Disproof would be possible only
if humanity found ways of building hypercomputers whose results should ‘naturally be
regarded as computable’.

Any non-interactive computer program can be translated into a Turing machine, and any 
Turing machine can be translated into any Turing complete programming language. So the 
thesis is equivalent to saying that any Turing-complete programming language is sufficient 
to express any algorithm. 

Variations of the thesis exist — for example, the Physical Church-Turing thesis (PCTT) 
states: 

‘Every function that can be physically computed can be computed by a Turing machine’. 

Another variation is the Strong Church-Turing Thesis (SCTT), which is not due to 
Church or Turing, but was rather realised gradually in the development of complexity 
theory. It states: 

‘Any ‘reasonable’ model of computation can be efficiently simulated on a probabilistic
Turing machine’.

The word ‘efficiently’ here means up to polynomial-time reductions. The Strong Church-Turing
Thesis, then, posits that all ‘reasonable’ models of computation yield the same
class of problems that can be computed in polynomial time. Assuming the conjecture that
probabilistic polynomial time (BPP) equals deterministic polynomial time (P), the word
‘probabilistic’ is optional in the Strong Church-Turing Thesis.


14.10 Exercises 


1. Explain the following variations of TM:
a. Multi-head
b. K-dimensional
c. Off-line
2. Present the formal definition of a non-deterministic TM.
3. Construct a non-deterministic TM to accept the language {0^n 1^m : n ≥ 1, m ≥ 1}.
4. Present the formal definition of a two-way infinite tape TM.
5. Design a two-tape TM to convert the input #w# into #ww*#.
6. Explain the following types of TM:
a. Universal
b. Random Access
c. Oracle
7. Define linear bounded automaton. Present its formal definition.
8. Give the general procedure to encode a TM.
9. Encode a TM that accepts the language of all strings which contain aba as a substring.
10. Compare TM with real machines.
11. Define recursively enumerable and recursive languages, with examples.
12. Prove that recursive languages are closed under complementation.
13. Prove that recursively enumerable languages are closed under union.
14. List the different undecidable problems in the theory of computation.
15. Explain briefly the Church Turing thesis.
Formal Languages / Grammar
Hierarchy 


‘Important indeed is to observe the correspondence between machines accepting 
a language and grammars generating a language.’ 


Introduction 


In the previous chapters, we discussed languages from different perspectives: 


a. Languages accepted by finite automata viz., regular languages. 

b. Languages accepted by Turing machine viz., recursive and recursively enumerable 
languages 

c. Languages generated by formal grammars viz., context-free languages. 


In this chapter, we first look at what a grammar is and how grammars are classified (in
the formal sense); then we briefly study the formal languages and their types (under
Chomsky’s hierarchy). Finally, we discuss the relationship between the types of grammars
and languages.


15.1 Formal Grammar 


As discussed earlier, a formal grammar (or simply grammar) is a precise description of a 
formal language. In other words, a grammar is a notation for defining a language through a 
finite number of rules. 


Two main categories of formal grammar are: 


■ Generative grammars - are sets of rules for the generation of strings in a language.
■ Analytic grammars - are sets of rules to determine whether a string is a member of the
language.


In other words, an analytic grammar describes how to recognise the strings which are 
members of a set, whereas a generative grammar describes how to write only those strings 
in the set. 
15.2 Generative Grammars


A generative grammar consists of a set of rules for transforming strings. To generate a string
in the language, one begins with a string consisting of only a single start symbol and then
applies the rules successively (any number of times) to rewrite the string. The language
consists of all the strings that can be generated in this manner. Any particular sequence
of legal choices, taken during this rewriting process, yields one particular string in the
language. If there are essentially different ways of generating a single string (for example,
two distinct leftmost derivations), then the grammar is said to be ambiguous.


15.2.1 Components of a Generative Grammar 
The generative grammar G consists of the following components: 


a. A finite set V of non-terminal symbols.
b. A finite set T of terminal symbols that is disjoint from V.
c. A finite set P of production rules.
d. A distinguished symbol S ∈ V, i.e., the start symbol.


15.2.2 Quad-Tuple Specification of Generative Grammar 


The formal definition of generative grammars was first proposed by Noam Chomsky in
the 1950s.


Formally, a generative grammar G is a quad-tuple 
G=(V,T,P,S) 


where, 
V is a finite set of non-terminals 
T is a finite set of terminals 
P is the finite set of production rules, each of the form 


(T ∪ V)* V (T ∪ V)* → (T ∪ V)*


i.e., each production rule maps from one string of symbols to another, where the
first string contains at least one non-terminal symbol.
S is the start symbol, S ∈ V.
The language of a generative grammar G, denoted by L(G), is defined as all those strings
over T, that can be generated by starting with the start symbol S and then applying the 
production rules in P until no more non-terminal symbols are present. 


15.3 Types of Generative Grammar 


In 1956, Noam Chomsky classified the generative grammars into types known as the
Chomsky hierarchy. The types are distinguished from one another by the form of their
production rules. The following are the types:


15.3.1 Unrestricted or Phrase-Structure Grammars 


An unrestricted grammar is a formal grammar, on which no restrictions are made on the left 
and right sides of the grammar’s productions. 


Formally, unrestricted grammar G is a quad-tuple 
G=(V,T,P,S) 


where V, T, S are the same as in the case of a generative grammar and P is the finite set of
production rules, each of the form


(V ∪ T)+ → (V ∪ T)*


i.e., no ε on the left-hand side of any production. However, ε can appear on the
right-hand side of the productions.


EXAMPLE 15.3.1: Let G = (V,T, P,S) be unrestricted grammar whose production rules 
are 


P = { S → AS | ε
A → aA | a
aaaA → ba }.


The derivation for w = bbaa is as follows:


S => AS
=> AAS
=> AAAS
=> AAA
=> aAAA
=> aaAAA
=> aaaAAA
=> baAA
=> baaAA
=> baaaAA
=> bbaA
=> bbaa.


In Chomsky’s hierarchy, an unrestricted grammar is called a type 0 grammar. The language
generated by this grammar is a recursively enumerable language and the recognisers are
Turing machines. In other words, for every unrestricted grammar G, there exists some TM
capable of recognising L(G) and vice versa.
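The derivation in Example 15.3.1 can be checked by a small program that rewrites sentential forms with the production rules. The Python sketch below performs a breadth-first search over all sentential forms up to a length limit. It is an illustration under stated assumptions: the function name derivable and the length bound max_len are hypothetical, and the bound is what keeps the search finite (in general, membership for unrestricted grammars is only semi-decidable).

```python
from collections import deque

def derivable(rules, start, target, max_len):
    """Breadth-first search over sentential forms of length <= max_len.

    rules is a list of (lhs, rhs) string pairs. Returns True if target
    is reached; False means 'not derivable within the length bound'.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        for lhs, rhs in rules:
            i = form.find(lhs)
            while i != -1:  # try every occurrence of the left-hand side
                new = form[:i] + rhs + form[i + len(lhs):]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                i = form.find(lhs, i + 1)
    return False

# Production rules of Example 15.3.1 (the empty string stands for ε).
rules = [("S", "AS"), ("S", ""), ("A", "aA"), ("A", "a"), ("aaaA", "ba")]
```

The longest sentential form in the derivation of bbaa has six symbols, so derivable(rules, "S", "bbaa", 6) succeeds.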


EXAMPLE 15.3.2: Recall that the language {a^n b^n a^n | n ≥ 1} is not a context-free language.
This language can be generated with an unrestricted grammar:


S → aSBA | abA
AB → BA
bB → bb
bA → ba
aA → aa.


EXAMPLE 15.3.3: {ww | w ∈ {a, b}*}, {a^n b^n | n ≥ 0} and {a^n | n ≥ 0} are some of the
languages that are generated by unrestricted grammars.


15.3.2 Context-Sensitive Grammar 


A context-sensitive grammar is a formal grammar, in which the left-hand sides and right-hand 
sides of any production rule may be surrounded by a context of terminal and non-terminal 
symbols. 
Formally, a context-sensitive grammar G is a quad-tuple
G = (V, T, P, S)


where V, T, S are the same as in the case of a generative grammar and P is the finite set of
productions, each of the form


(V ∪ T)+ → (V ∪ T)+


i.e., no ε on the left-hand or right-hand side of any production. Hence, this grammar is
ε-free.


EXAMPLE 15.3.4: Let G = (V, T, P, S) be a context-sensitive grammar whose production
rules are


P = { S → aBb
aB → bBB
bB → aa
B → b }.

The derivation for w = aaabb is as follows:
S => aBb
=> bBBb
=> aaBb
=> abBBb
=> aaaBb
=> aaabb.


In Chomsky’s hierarchy, a context-sensitive grammar is called a type 1 grammar. The language
generated by this grammar is a context-sensitive language and the recogniser is an LBA.


EXAMPLE 15.3.5: The language {a^n b^n c^n | n ≥ 1} is a context-sensitive language. This
language can be generated with a context-sensitive grammar:


S → abc | aSBc
cB → Bc
bB → bb.

EXAMPLE 15.3.6: The language {a^p : p is a prime number} is generated with a
context-sensitive grammar.


15.3.3 Context-Free Grammar 


A context-free grammar is a grammar in which the left-hand side of each production rule
consists of only a single non-terminal symbol. Because of this restriction, not all languages
can be generated by a context-free grammar.


Formally, a context free grammar G is a quad-tuple 


G=(V,T,P,S) 


where V, T, S are the same as in the case of a generative grammar and P is the finite set of
productions, each of the form
A → (V ∪ T)*


i.e., a single non-terminal symbol (A in this case) on the left-hand side and any number of
terminals and non-terminals (including ε) on the right-hand side of the production.


EXAMPLE 15.3.7: Let G = (V, T, P, S) be a context-free grammar whose production rules
are:
P = {
S → aSa
S → bSb
S → a | b | ε
}


The derivation for w = ababa is as follows: 


S => aSa 
=> abSba 
=> ababa 
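
The grammar of Example 15.3.7 translates directly into a recursive membership test. This is a minimal sketch, with the function name `derives` chosen for illustration: each branch corresponds to one group of productions.

```python
def derives(w):
    """Return True if S =>* w using S -> aSa | bSb | a | b | ε."""
    if w in ("", "a", "b"):                               # S -> ε | a | b
        return True
    if len(w) >= 2 and w[0] == w[-1] and w[0] in "ab":    # S -> aSa | bSb
        return derives(w[1:-1])
    return False

print(derives("ababa"))
```

The recursion peels one matching symbol from each end, exactly as the derivation S => aSa => abSba => ababa builds the string from the outside in.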


In Chomsky’s hierarchy, a context-free grammar is called a type 2 grammar. The language
generated by such a grammar is a context-free language and the corresponding recogniser is
the pushdown automaton (PDA).

EXAMPLE 15.3.8: The language {a^n b^n | n ≥ 1} is a context-free language and it
can be generated with the context-free grammar:

S → aSb

S → ab.


EXAMPLE 15.3.9: The language {ww^R | w ∈ {a, b}*} is generated by a context-free
grammar.


15.3.4 Regular Grammar 


Recall that a grammar is said to be regular iff it is right-linear or left-linear. Informally, a
regular grammar is a formal grammar with restrictions on both the left-hand and right-hand
sides of the productions.
■ The left-hand side of each production rule consists of only a single non-terminal symbol.
■ The right-hand side of each production rule may be empty, or a single terminal symbol, or
a single terminal symbol followed by a non-terminal symbol, but nothing else. (A broader
definition that also allows longer strings of terminals, or a single non-terminal without
anything else, describes the same class of languages.)


Formally, a regular grammar G is a quad-tuple
G = (V, T, P, S)


where V, T and S are the same as for a generative grammar and P is the finite set of
productions, each of the form


A → t or A → tB or A → ε


where A, B ∈ V and t ∈ T.


EXAMPLE 15.3.10: Let G = (V, T, P, S) be a regular grammar whose production rules are


P = { S → aS
      S → a }.
The derivation for w = aaa is as follows:
S => aS
=> aaS
=> aaa.

In Chomsky’s hierarchy, a regular grammar is called a type 3 grammar. The language
generated by such a grammar is a regular language and the corresponding recogniser is the
finite automaton (FA).


EXAMPLE 15.3.11: The language {a^m b^n | m, n ≥ 1} is a regular language and it
can be generated with the regular grammar:


S → aA
A → aA
A → bB
B → bB
B → ε.
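
A right-linear grammar can be read as a non-deterministic finite automaton whose states are the non-terminals. The sketch below assumes the grammar of Example 15.3.11 is stored as a dictionary from non-terminals to right-hand sides, with the empty string standing for ε; the representation and the name `accepts` are illustrative only.

```python
# Productions of Example 15.3.11: non-terminal -> list of right-hand sides.
# "" stands for ε; every other right-hand side is a terminal followed by a non-terminal.
GRAMMAR = {"S": ["aA"], "A": ["aA", "bB"], "B": ["bB", ""]}

def accepts(word, grammar=GRAMMAR, start="S"):
    """Simulate the NFA whose states are the non-terminals of the grammar."""
    current = {start}
    for symbol in word:
        current = {rhs[1] for state in current for rhs in grammar[state]
                   if len(rhs) == 2 and rhs[0] == symbol}
    # A state is accepting when its non-terminal has an ε-production.
    return any("" in grammar[state] for state in current)

print(accepts("aabbb"))
```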


The languages generated by regular grammars are commonly expressed using regular expressions.
Table 15.1 summarises the different types of generative grammars discussed so far.


Chomsky’s hierarchy (type) | Grammar           | Restrictions on productions                           | Acceptor
0                          | Unrestricted      | (V ∪ T)* → (V ∪ T)*                                   | TM
1                          | Context-sensitive | (V ∪ T)⁺ → (V ∪ T)⁺                                   | LBA
2                          | Context-free      | A → (V ∪ T)*, where A is a single non-terminal        | PDA
3                          | Regular           | A → t or A → tB or A → ε, where A, B ∈ V and t ∈ T    | FA


Table 15.1 Generative Grammars


15.3.5 Other Forms of Generative Grammar 


Some extensions and variations on Chomsky’s original hierarchy of formal grammars have 
been developed in recent years. Some forms include: 


a. Tree-adjoining grammars 
In this type of grammar, the rewrite rules operate on parse trees instead
of just strings.

b. Affix grammars and attribute grammars
In this type of generative grammar, the rewrite rules may be augmented with
semantic attributes and operations. This is useful both for increasing the expressiveness of
the grammar and for constructing practical language translation tools.


15.4 Analytic Grammars 


Analytic grammars are used to design a parser for the language. Examples of analytic 
grammars include the following: 


a. Top-down parsing language: a highly minimalist analytic grammar formalism, 
developed in the early 1970s, to study the behavior of top-down parsers. 

b. Link grammars: a form of analytic grammar, designed for linguistics, which derives 
syntactic structure by examining the positional relationships between pairs of words. 

c. Parsing expression grammars: a more recent generalisation of top-down parsing 
language, designed around the practical expressiveness needs of programming 
language and compiler writers. 


15.5 Formal Languages 


As discussed in chapter 1, a formal language is a set of finite-length words, drawn from 
some finite alphabet. In the following sections, we briefly discuss the following types of 
formal languages and their properties. 


■ Regular language
■ Context-free language
■ Recursive language
■ Recursively enumerable language
■ Deterministic context-free language
■ Indexed language


15.6 Regular Language 


A regular language is a formal language, that satisfies the following equivalent properties: 


■ It can be accepted by a DFA.
■ It can be accepted by an NFA.
■ It can be described by a regular expression.
■ It can be generated by a regular grammar.
■ It can be accepted by a read-only TM (a 2DFA is also known as a read-only TM).


Closure Properties 
The regular languages are closed under the following operations. If L, L1 and L2 are regular
languages, then


■ the complement Σ* − L of L is also regular.
■ the Kleene closure L* of L is also regular.
■ the union L1 ∪ L2 is also regular.
■ the intersection L1 ∩ L2 is regular.
■ the difference L1 − L2 is regular.
■ the concatenation L1 ∘ L2 is regular.
■ the reverse L^R of L is also regular.
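
Two of these closure properties can be made concrete with DFAs. The sketch below is one possible illustration, assuming a DFA is given as a tuple (states, transition dictionary, start state, accepting states) with a total transition function over a shared alphabet; the two sample automata are hypothetical. Complement swaps accepting and non-accepting states, and intersection uses the product construction.

```python
from itertools import product

def run(dfa, word):
    """Return True if the DFA accepts the word."""
    states, delta, start, finals = dfa
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals

def complement(dfa):
    """Same automaton with accepting and non-accepting states swapped."""
    states, delta, start, finals = dfa
    return (states, delta, start, states - finals)

def intersection(d1, d2):
    """Product automaton: runs both DFAs in parallel, accepts when both accept."""
    s1, t1, q1, f1 = d1
    s2, t2, q2, f2 = d2
    alphabet = {symbol for (_, symbol) in t1}
    states = set(product(s1, s2))
    delta = {((p, q), a): (t1[(p, a)], t2[(q, a)])
             for (p, q) in states for a in alphabet}
    finals = {(p, q) for (p, q) in states if p in f1 and q in f2}
    return (states, delta, (q1, q2), finals)

# Hypothetical sample automata over {a, b}.
EVEN_A = ({0, 1}, {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}, 0, {0})
ENDS_B = ({0, 1}, {(0, "a"): 0, (1, "a"): 0, (0, "b"): 1, (1, "b"): 1}, 0, {1})

print(run(intersection(EVEN_A, ENDS_B), "aab"))
```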


Note-1: All finite languages (those containing only a finite number of words) are regular.


Examples of Regular Languages


a. the language consisting of all strings over the alphabet {a, b} that contain an even number
of a’s.

b. the language consisting of all strings over the alphabet {0, 1} that contain an odd number
of 1’s and an even number of 0’s.

c. the language consisting of all strings over Σ = {0, 1} except those containing the
substring 001.


To check whether a language is regular


To prove that a given language is not regular, one uses


■ the Pumping Lemma, or
■ the Myhill-Nerode theorem.


We have already studied the Pumping Lemma in chapter 12. In this section, we briefly explain
the Myhill-Nerode theorem.

Theorem 1 [MYHILL-NERODE THEOREM]

The Myhill-Nerode theorem provides a necessary and sufficient condition for a language
to be regular. It is almost exclusively used to prove that a given language is not
regular.

The theorem is named after John Myhill and Anil Nerode, who proved it at the University
of Chicago in 1958.


Statement: The Myhill-Nerode theorem states that a language L is regular if and only if the
relation R_L (defined below) has finitely many equivalence classes, and that the number of
states in the smallest deterministic automaton accepting L is equal to the number of
equivalence classes of R_L. The intuition is that if one starts with such a minimal automaton,
then any strings x and y that drive it to the same state will be in the same equivalence class.
Also, if one starts with a partition into equivalence classes, one can easily construct an
automaton that uses its state to keep track of the equivalence class containing the part of
the string seen so far.


Corollary: If a language defines an infinite set of equivalence classes, it is not regular. 

This corollary is frequently used to prove that a language is non-regular. 

Equivalence classes: Given a language L, define a relation R_L on strings by the rule
x R_L y


if there is no distinguishing extension z, i.e., no string z with the property that exactly one
of the strings xz and yz is in L. Thus, R_L is an equivalence relation on strings and it divides
the set of all finite strings into one or more equivalence classes.
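
To see how the corollary is applied, the following sketch searches for distinguishing extensions in the language {a^n b^n | n ≥ 0}, a standard non-regular language chosen here purely for illustration. The function names and the candidate lists are assumptions; the code only inspects finitely many strings and therefore illustrates, rather than proves, that the prefixes a^i fall into pairwise different classes.

```python
from itertools import combinations

def in_language(w):
    """Membership in {a^n b^n | n >= 0}."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

def distinguishing_extension(x, y, candidates):
    """Return some z with exactly one of xz, yz in the language, or None."""
    for z in candidates:
        if in_language(x + z) != in_language(y + z):
            return z
    return None

prefixes = ["a" * i for i in range(5)]
extensions = ["b" * k for k in range(6)]

for x, y in combinations(prefixes, 2):
    print(repr(x), repr(y), repr(distinguishing_extension(x, y, extensions)))
```

Since a^i and a^j with i ≠ j are separated by the extension b^i, the relation has infinitely many classes, which is the pattern of argument used with the corollary.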


EXAMPLE 15.6.1: Consider the regular language of binary numbers over the alphabet
Σ = {0, 1} that are divisible by three. There are three equivalence
classes of strings for this language:


a. numbers that give 0 as a remainder when divided by 3.
b. numbers that give 1 as a remainder when divided by 3.
c. numbers that give 2 as a remainder when divided by 3.


Thus the minimal automaton accepting this language has three states, corresponding
to the equivalence classes, as shown in figure 15.1.
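
The three-state automaton of this example can be simulated in a few lines. This is a minimal sketch in which the state is simply the remainder of the number read so far; the function names are illustrative.

```python
def remainder_state(bits):
    """Run the three-state automaton; the state is the value read so far mod 3."""
    state = 0
    for bit in bits:
        # Appending a bit doubles the value read so far and adds the new bit.
        state = (2 * state + int(bit)) % 3
    return state

def divisible_by_three(bits):
    return remainder_state(bits) == 0

for word in ["11", "110", "1001", "10", "111"]:
    print(word, divisible_by_three(word))
```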


Figure 15.1. The Minimal Automaton to Accept L = {11, 110, 1001, ...}

John R. Myhill was a mathematician. He received his Ph.D. from Harvard
University under Willard Van Orman Quine in 1949. He was a professor at SUNY
Buffalo from 1966 until his death in 1987. He also taught at several other universities.

Anil Nerode is a U.S. mathematician. He received his Ph.D. in mathematics from
the University of Chicago, under Saunders Mac Lane, and is presently Goldwin Smith
Professor of Mathematics at Cornell University.

His interests are in mathematical logic, the theory of automata, computability
and complexity theory, the calculus of variations, and distributed systems.

With John Myhill, Nerode proved the Myhill-Nerode theorem, specifying necessary
and sufficient conditions for a formal language to be regular.


15.7 Context-Free Language 


The following are equivalent definitions of the concept of a context-free language.


a. A formal language that is generated by a context-free grammar is a context-free
language.

b. The context-free languages are exactly the languages accepted by PDAs.


Closure Properties 


Context-free languages are closed under the following operations. If L, L1 and L2 are
context-free languages and R is a regular language, then:


■ the Kleene closure L* of L is also a context-free language
■ the concatenation L1 ∘ L2 is a context-free language
■ the union L1 ∪ L2 is context-free
■ the intersection of a context-free language L and a regular language R is always
context-free, i.e., L ∩ R is context-free
■ the reverse L^R of L is also context-free.


15.7.1 Properties of Context-Free Languages 


■ An alternative and equivalent definition of context-free languages employs
non-deterministic PDA: a language is context-free if and only if it can be accepted by
such an automaton.

■ A language can also be modelled as the set of all sequences of terminals that are
accepted by the grammar. This model is helpful in understanding set operations on
languages.

■ Context-free languages are not closed under complementation, intersection or
difference.


Examples of Context-Free Languages 


a. The language of palindromes over the alphabet Σ = {a, b, c} is a context-free language.
b. The language L = {w | n_a(w) = n_b(w)} is a context-free language.
c. The language L = {a^n b^n | n ≥ 0} is a context-free language.


Note-2: To check whether a language is context-free
To prove that a given language is not context-free, one may employ the
pumping lemma for context-free languages.


15.8 Context-Sensitive Language 


A formal language, that can be described by a context-sensitive grammar, is called a 
context-sensitive language. 


15.8.1 Closure Properties 


■ The union, intersection, and concatenation of two context-sensitive languages are all
context-sensitive.
■ The complement of a context-sensitive language is also context-sensitive.


15.8.2 Computational Properties 


Computationally, the context-sensitive languages are equivalent to linear bounded
non-deterministic TMs (i.e., non-deterministic TMs with a tape of only C·n cells, where n is
the size of the input and C is a constant associated with the machine). This means that every
formal language that can be decided by such a machine is a context-sensitive language, and
every context-sensitive language can be decided by such a machine.

Examples of Context-Sensitive Languages
■ The language L = {a^n b^n c^n : n ≥ 1} is context-sensitive.
■ The language L = {a^n b^n c^n d^n : n ≥ 1} is context-sensitive.
■ The language L = {a^p : p is a prime number} is context-sensitive.


Note-3: Normal forms: 
Every context-sensitive grammar which does not generate the empty string can 
be transformed into an equivalent one in Kuroda normal form. 


15.9 Recursive Language 


The following are the equivalent definitions for the concept of a recursive language: 


a. A recursive formal language is a recursive subset of the set of all possible words over
the alphabet of the language.
b. A recursive language is a formal language for which there exists a TM which, when
presented with any input string,
■ halts and accepts if the string is in the language
■ halts and rejects otherwise.
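
Definition (b) describes a procedure that always halts with a yes or no answer. As a rough illustration, with an ordinary Python function standing in for the TM, the sketch below decides the language {a^n b^n c^n | n ≥ 1}; the function name is hypothetical.

```python
import re

def decide_anbncn(w):
    """Total decision procedure for {a^n b^n c^n | n >= 1}: always returns True or False."""
    match = re.fullmatch(r"(a+)(b+)(c+)", w)
    if match is None:
        return False                     # wrong shape: halt and reject
    a_run, b_run, c_run = match.groups()
    return len(a_run) == len(b_run) == len(c_run)

print(decide_anbncn("aabbcc"), decide_anbncn("aabbc"))
```

The function terminates on every input, which is the property that separates a recursive language from one that is merely recursively enumerable.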


15.9.1 Closure Properties 


Recursive languages are closed under the following operations. If L, L1 and L2 are recursive
languages, then


■ the Kleene closure L* of L is recursive
■ the concatenation L1 ∘ L2 is recursive
■ the union L1 ∪ L2 is recursive
■ the intersection L1 ∩ L2 is recursive
■ the complement Σ* − L of L is also recursive.


Note-4: The recursive languages are not defined as a level of the Chomsky hierarchy.


15.10 Recursively Enumerable Language 


There exist three equivalent definitions for the concept of a recursively enumerable language.
a. A recursively enumerable formal language is a recursively enumerable subset of the
set of all possible words over the alphabet of the language.

b. A recursively enumerable language is a formal language for which there exists a TM
which will enumerate all valid strings of the language.

c. A recursively enumerable language is a formal language for which there exists a TM
that will halt and accept when presented with any string in the language as input, but
may halt and reject or loop forever when presented with a string not in the language.
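
The link between definitions (b) and (c) can be sketched in code: given an enumerator, one obtains an acceptor by waiting until the input shows up in the enumeration. In the sketch below a Python generator plays the role of the enumerating TM. The sample enumerator (strings a^p for prime p) and the optional `max_steps` budget are assumptions added for illustration; that sample language happens to be decidable, so it only demonstrates the mechanism.

```python
from itertools import count

def enumerate_language():
    """Hypothetical enumerator: yields a^p for each prime p, in increasing order, forever."""
    for p in count(2):
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):
            yield "a" * p

def semi_decide(word, enumerator, max_steps=None):
    """Return True once `word` is enumerated.

    With max_steps=None the search never stops on a word outside the language,
    which mirrors a TM that loops forever. With a budget, None means "no verdict".
    """
    for step, candidate in enumerate(enumerator()):
        if candidate == word:
            return True
        if max_steps is not None and step + 1 >= max_steps:
            return None

print(semi_decide("aaaaa", enumerate_language))
```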


15.10.1 Closure Properties 


Recursively enumerable languages are closed under the following operations. If L, L1 and
L2 are recursively enumerable languages, then:


■ the Kleene closure L* of L is recursively enumerable
■ the concatenation L1 ∘ L2 is recursively enumerable
■ the union L1 ∪ L2 is recursively enumerable
■ the intersection L1 ∩ L2 is recursively enumerable.


Recursively enumerable languages are not closed under set difference and complementation.


15.10.2 Recursive vs. Recursively Enumerable Languages 


■ A recursively enumerable language is accepted by a TM that enters a final state on every
string in the language; on any other string the TM may reject or loop forever.
A recursive language is accepted by a TM that always halts, even when it rejects the
input.

■ Recursively enumerable languages are known as type 0 languages in Chomsky’s
hierarchy of formal languages.

■ The acceptors of recursively enumerable languages are TMs.
The corresponding acceptors of recursive languages are TMs that never loop.


15.11 Other Forms of Formal Languages 


15.11.1 Indexed Language


The indexed languages are a class of formal languages, introduced by Alfred Aho, which is a
proper subset of the context-sensitive languages and a proper superset of the context-free
languages.
An indexed language is characterised by an indexed grammar, or by a nested
stack automaton. An indexed grammar may have a stack attached to non-terminals, which
gets copied to sub-nonterminals. A nested stack automaton may read its stack, in addition
to pushing or popping it. Also, a stack may nest other stacks inside it.


The indexed languages are not defined in the Chomsky hierarchy.


15.11.2 Deterministic Context-Free Language 


The deterministic context-free languages are a proper subset of the languages defined by
context-free grammars.

The set of deterministic context-free languages is identical to the set of languages accepted
by deterministic PDAs. These types of languages are not defined in the Chomsky hierarchy.
The table below summarises some of the formal languages:


Chomsky hierarchy | Type of language | Language defined by | Corresponding acceptor | Does non-determinism give more power? | Set is closed under
3 | Regular language | Regular expression | Finite automaton | No | Union, concatenation, Kleene star, intersection, complement
2 | Context-free language | Context-free grammar | Pushdown automaton | Yes | Concatenation, Kleene star, union
1 | Context-sensitive language | Context-sensitive grammar | Linear-bounded automaton | Unknown (open problem) | Union, intersection, concatenation, complement
Not defined | Recursive language | No grammar characterising the recursive languages is known so far | TM that never loops | No | Concatenation, Kleene star, union, intersection, complement
0 | Recursively enumerable language | Unrestricted grammar | TM | No | Concatenation, Kleene star, union, intersection


Table 15.2 Formal Languages

15.12 Relationship between Grammars and Languages


Every regular language is a context-free language, because a regular grammar is also a
context-free grammar.

Not every context-free language is a regular language.

Example: L = {a^n b^n | n ≥ 0} is context-free but not regular.

Every context-free language is a context-sensitive language.

Not every context-sensitive language is context-free.

Example: L = {a^n b^n c^n | n ≥ 0} is context-sensitive but not context-free.

Every context-free language is recursive.

Not every recursive language is context-free.

Example: L = {a^n b^n c^n | n ≥ 0} is recursive but not context-free.

Every context-sensitive language is recursive.

There is a recursive language which is not context-sensitive. Moreover, a recursive
language containing the null string cannot be a context-sensitive language.

The problem of deciding whether a context-sensitive grammar describes a context-free
language is undecidable.

All recursive languages are also recursively enumerable, and all regular, context-free and
context-sensitive languages are recursive.

All regular, context-free, context-sensitive and recursive languages are
recursively enumerable languages.

Each recursive language is necessarily a recursively enumerable language, but the converse
need not be true.


Figure 15.2 shows the set-superset relationship between types of grammars/languages.

Figure 15.2. The Chomsky Language Hierarchy

15.13 Exercises


1. Explain the two main categories of formal grammar.
2. Present the formal definition of phrase-structure grammar.
3. Show that L = {a”'|n > 0} can be generated with unrestricted grammar.
4. Explain context-sensitive grammar with examples.
5. Show that L = {a^p | p is a prime number} is generated by a context-sensitive grammar.
6. Present the formal definition of regular grammar.
7. Give the regular grammar for the language L = {0^n | n ≥ 1}, over T = {0}.
8. State the Myhill-Nerode theorem.
9. List the closure properties of the following languages:
a. regular
b. context-free
c. recursive.
10. Explain the following:
a. indexed language
b. deterministic context-free language.
11. Show that the language L = {a^n b^n c^n | n ≥ 0} is not context-free but is Turing decidable.
12. Explain the relationship of formal languages and formal grammar in the Chomsky
hierarchy.


Appendix A: Symbol Chart 


Greek Letter Symbols

α alpha        θ theta        π pi          τ tau
β beta         ϑ vartheta     ϖ varpi       υ upsilon
γ gamma        ι iota         ρ rho         ϕ phi
δ delta        κ kappa        ϱ varrho      φ varphi
ϵ epsilon      λ lambda       σ sigma       χ chi
ε varepsilon   μ mu           ς varsigma    ψ psi
ζ zeta         ν nu                         ω omega
η eta          ξ xi

Γ Gamma        Λ Lambda       Σ Sigma       Ψ Psi
Δ Delta        Ξ Xi           Υ Upsilon     Ω Omega
Θ Theta        Π Pi           Φ Phi


Binary Operation Symbols

± pm       ∩ cap        ⋄ diamond           ⊕ oplus
∓ mp       ∪ cup        △ bigtriangleup     ⊖ ominus
× times    ⊎ uplus      ▽ bigtriangledown   ⊗ otimes
÷ div      ⊓ sqcap      ◁ triangleleft      ⊘ oslash
∗ ast      ⊔ sqcup      ▷ triangleright     ⊙ odot
⋆ star     ∨ vee        ⊲ lhd               ○ bigcirc
∘ circ     ∧ wedge      ⊳ rhd               † dagger
• bullet   ∖ setminus   ⊴ unlhd             ‡ ddagger
· cdot     ≀ wr         ⊵ unrhd             ⨿ amalg
+ plus     − minus


Relation Symbols

≤ leq          ≥ geq          ≡ equiv          ⊨ models
≺ prec         ≻ succ         ∼ sim            ⊥ perp
⪯ preceq       ⪰ succeq       ≃ simeq          ∣ mid
≪ ll           ≫ gg           ≍ asymp          ∥ parallel
⊂ subset       ⊃ supset       ≈ approx         ⋈ bowtie
⊆ subseteq     ⊇ supseteq     ≅ cong           ⨝ Join
⊏ sqsubset     ⊐ sqsupset     ≠ neq            ⌣ smile
⊑ sqsubseteq   ⊒ sqsupseteq   ≐ doteq          ⌢ frown
∈ in           ∋ ni           ∝ propto         = equal(eq)
⊢ vdash        ⊣ dashv        < lessthan(L)    > greaterthan(g)


Miscellaneous Symbols

… ldots    ⋯ cdots      ⋮ vdots        ⋱ ddots
ℵ aleph    ′ prime      ∀ forall       ∞ infty
ℏ hbar     ∅ emptyset   ∃ exists       □ Box
ı imath    ∇ nabla      ¬ neg          ◇ Diamond
ȷ jmath    √ surd       ♭ flat         △ triangle
ℓ ell      ⊤ top        ♮ natural      ♣ clubsuit
℘ wp       ⊥ bot        ♯ sharp        ♢ diamondsuit
ℜ Re       ‖ |          \ backslash    ♡ heartsuit
ℑ Im       ∠ angle      ∂ partial      ♠ spadesuit
℧ mho      . dot        | pipe


Arrow Symbols

← leftarrow           ⟵ longleftarrow         ↑ uparrow
⇐ Leftarrow           ⟸ Longleftarrow         ⇑ Uparrow
→ rightarrow          ⟶ longrightarrow        ↓ downarrow
⇒ Rightarrow          ⟹ Longrightarrow        ⇓ Downarrow
↔ leftrightarrow      ⟷ longleftrightarrow    ↕ updownarrow
⇔ Leftrightarrow      ⟺ Longleftrightarrow    ⇕ Updownarrow
↦ mapsto              ⟼ longmapsto            ↗ nearrow
↩ hookleftarrow       ↪ hookrightarrow        ↘ searrow
↼ leftharpoonup       ⇀ rightharpoonup        ↙ swarrow
↽ leftharpoondown     ⇁ rightharpoondown      ↖ nwarrow
⇌ rightleftharpoons   ↝ leadsto


Variable-Sized Symbols

∑ sum      ⋂ bigcap     ⨀ bigodot
∏ prod     ⋃ bigcup     ⨂ bigotimes
∐ coprod   ⨆ bigsqcup   ⨁ bigoplus
∫ int      ⋁ bigvee     ⨄ biguplus
∮ oint     ⋀ bigwedge


Punctuation Symbols

, comma    ; semicolon    : colon    . ldotp    · cdotp


Delimiters

( (          ) )          ↑ uparrow        ⇑ Uparrow
[ [          ] ]          ↓ downarrow      ⇓ Downarrow
{ {          } }          ↕ updownarrow    ⇕ Updownarrow
⌊ lfloor     ⌋ rfloor     ⌈ lceil          ⌉ rceil
⟨ langle     ⟩ rangle     / /              \ backslash
| |          ‖ |


Large Delimiters

⎱ rmoustache   ⎰ lmoustache   ⟯ rgroup   ⟮ lgroup
| arrowvert    ‖ Arrowvert    ⎪ bracevert


Appendix B: Sets, Relations, Functions and
Mathematical Induction


B.1.1 What is a set? 


A set is a group of elements having a property that characterises those elements.
Points to remember:
a. The objects of a set are called the elements or members of the set.
b. Sets are denoted by capital letters.


c. A set is a well-defined collection of objects, which means that given an object, we should
be able to say whether the object is an element of the given set or not. Secondly, all
the elements of the set must be distinct. This means that if an element is repeated, it
is not counted more than once.


d. The order in which the elements are enumerated is immaterial. The concept of a set
is so fundamental that it unifies mathematics and its cognates.


B.1.2 How to specify a set? 


One way is to enumerate the elements completely. All the elements belonging to the set are 
explicitly given. This method of representing sets is called roster method. 


EXAMPLE 1: A = {1, 2, 3, 4, 5} 
Another way is to give the properties that characterise the elements of the set. This is called 
rule method. 


EXAMPLE 2: B = {x |x is a positive integer less than or equal to 5} 
Some sets can also be defined recursively. 


A recursive definition of a set S consists of three clauses:


■ The basis clause explicitly lists at least one primitive element in S, ensuring that S is
nonempty.
■ The recursive clause provides a systematic recipe to generate new elements from known
elements.
■ The terminal clause guarantees that the first two clauses are the only ways through which
the elements of S are generated. (This clause is generally omitted for convenience.)


EXAMPLE 3: Let the set S be recursively defined as follows: 3 € S and if X € S, then 
3* € S. That is, S = (3,3°,---}- 


EXAMPLE 4: S = {b_i | b_1 = 5, b_{i+1} = b_i + 5}, where b_i ≤ 20, is the set of positive
integers less than or equal to 25 that are divisible by 5.


B.1.3 Set terminology 


a. Belongs To
The phrase ‘is an element of’ or ‘belongs to’ or ‘is a member of’ is denoted by the
symbol ∈.


EXAMPLE 5: We write a ∈ B, e ∈ B, 3 ∈ C etc.
When an object is not a member of a set, this is denoted by the symbol ∉, which means
‘does not belong to’ or ‘is not a member of’.


EXAMPLE 6: We write a ∉ B, e ∉ B, b ∈ B, where B = {b, c}.


b. Finite and Infinite Sets 
If the number of elements in a set can be counted (i.e., a definite number) then the set 
is called a finite set. If it is not possible to count the number of elements of a set, the 
set is called an infinite set. In other words, a set that is not finite is infinite. 


c. Null/ Empty/ Void Set
A set which has no elements is called the null set and is denoted by ∅.


d. Equal Sets 
Two sets A and B are said to be equal, if they have the same elements and have the 
same number of elements. A is equal to B is represented as A = B. 


EXAMPLE 7: If A = {x, y, z}, B= {y, x, z}, C= {z, x, y}, then A=B=C. 


e. Equivalent Sets
Two sets A and B are said to be equivalent if they have the same number of elements.
That A is equivalent to B is represented as A ~ B.


EXAMPLE 8: If A = {x, y, z, w}, B = {1, 2, 3, 4}, then A and B are equivalent.

f. Cardinality
If a set S has n distinct elements for some natural number n, then n is the cardinality
(size) of S and S is a finite set. The cardinality of S is denoted by |S|.


EXAMPLE 9: The cardinality of the set {3, 1, 2} is 3. 


g. Singleton 
A set is said to be singleton, if it has only one element. 


EXAMPLE 10: {0}, {a}, {4/5} are singleton sets. 
h. Subset
Let A and B be two sets.
A is a subset of B if every element of A is an element of B.
That A is a subset of B is represented as A ⊆ B.


EXAMPLE 11: Let A = {1, 2, 3} and B = {1, 2, 3, 4, 5, 6}; then A is a subset of B,
denoted as A ⊆ B.


Note: If A is a subset of B and B is a subset of A, then A = B. Also, if A is a subset
of B and not equal to B, then A is a proper subset of B, denoted by A ⊂ B.


i. Power Set
Let A be a set. Then the set of all subsets of A is called the power set of A and is denoted
by P(A).


EXAMPLE 12: Let A = {1, 2, 3}. Then P(A) = {{1}, {2}, {3}, {1, 2}, {2, 3}, {1, 3},
{1, 2, 3}, ∅}.


Note: If A has N elements, then P(A) has 2^N elements.
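
The power set of a small set can be listed by taking all combinations of every size. This is a minimal sketch using the Python standard library; the function name `power_set` is illustrative.

```python
from itertools import chain, combinations

def power_set(elements):
    """Return all subsets of `elements` as a list of sets."""
    items = list(elements)
    all_sizes = (combinations(items, r) for r in range(len(items) + 1))
    return [set(c) for c in chain.from_iterable(all_sizes)]

subsets = power_set({1, 2, 3})
print(len(subsets))
```

For a set with N elements the list has 2^N entries, in agreement with the note above.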


j. Universal Set


The set U (≠ ∅) that contains every set under discussion as a subset is called the
universal set.


EXAMPLE 13:


(i) If A = the set of all computer books in the college library, then U = the set of all
books in the college library is the universal set.
(ii) The set U = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} is the universal set for the set
A = {1, 2, 3, 4}.


k. Superset
If A ⊆ B, then B is said to be a superset of A; this is denoted by B ⊇ A.


EXAMPLE 14: Let A = {1, 2, 3}, B = {1, 2, 3, 4, 5, 6}; then A ⊆ B, hence B ⊇ A.


B.1.4 Set Operations 


The operations that can be performed on sets are:


a. Union
If A and B are two sets, then the union of A and B is the set that contains all the elements
of A and B, including the ones which occur in both A and B. It is denoted by A ∪ B.
Mathematically, A ∪ B = {x | (x ∈ A) ∨ (x ∈ B)}


EXAMPLE 15: If A = {1, 2, 3} and B = {3, 4, 5}, then A ∪ B = {1, 2, 3, 4, 5}.


b. Difference
If A and B are two sets, then the difference of A and B is the set that consists of the
elements of A that are not in B. It is denoted by A − B.
Mathematically, A − B = {x | x ∈ A and x ∉ B}


EXAMPLE 16: If A = {1, 2, 3} and B = {3, 4, 5}, then A − B = {1, 2}.
Note that in general A − B ≠ B − A.
For, in the above example, B − A = {4, 5}.


c. Intersection
If A and B are two sets, then the intersection of A and B is the set that consists of the
elements in both A and B. It is denoted by A ∩ B.
Mathematically, A ∩ B = {x | (x ∈ A) ∧ (x ∈ B)}.


EXAMPLE 17: If A = {1, 2, 3, 8} and B = {3, 4, 5, 8}, then A ∩ B = {3, 8}.


d. Complement
If A is a set, then the complement of A is the set consisting of all elements of the
universal set that are not in A. It is denoted by A′ or Ā. Thus
A′ = {x | x ∈ U ∧ x ∉ A}, where ∉ means ‘is not an element of’.


EXAMPLE 18: If U = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10} and A = {1, 3, 5, 7, 9}, then A′ =
{0, 2, 4, 6, 8, 10}.

B.1.5 Disjoint sets
A and B are said to be disjoint if they have no elements in common,
i.e., A ∩ B = ∅, where ∅ is the empty set.

EXAMPLE 19: If A = {1, 2, 3, 4, 5} and B = {6, 8, 9}, then A ∩ B = ∅. Hence, A and B are
disjoint.
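
The set operations of Examples 15 to 19 map directly onto Python's built-in set type. The following sketch reproduces those examples; the variable names are illustrative, and the complement is computed as a difference from an explicitly given universal set.

```python
A, B = {1, 2, 3}, {3, 4, 5}

union = A | B                                # Example 15
difference = A - B                           # Example 16
reverse_difference = B - A
intersection = {1, 2, 3, 8} & {3, 4, 5, 8}   # Example 17

U = set(range(11))                           # universal set of Example 18
complement = U - {1, 3, 5, 7, 9}

disjoint = {1, 2, 3, 4, 5}.isdisjoint({6, 8, 9})   # Example 19

print(union, difference, reverse_difference, intersection, complement, disjoint)
```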
B.1.6 Set Identities
A, B, C represent arbitrary sets, ∅ is the empty set and U is the universal set.

a. The Commutative laws:
A ∪ B = B ∪ A
A ∩ B = B ∩ A

b. The Associative laws:
A ∪ (B ∪ C) = (A ∪ B) ∪ C
A ∩ (B ∩ C) = (A ∩ B) ∩ C
c. The Distributive laws:
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
d. The Idempotent laws:
A ∪ A = A
A ∩ A = A
e. The Absorption laws:
A ∪ (A ∩ B) = A
A ∩ (A ∪ B) = A
f. The De Morgan laws:
(A ∪ B)′ = A′ ∩ B′
(A ∩ B)′ = A′ ∪ B′
g. Other laws involving complements:
Double Complementation law: (A′)′ = A
Inverse laws: A ∩ A′ = ∅ and A ∪ A′ = U

h. Other laws involving the empty set:
A ∪ ∅ = A
A ∩ ∅ = ∅
i. Other laws involving the universal set:
A ∪ U = U
A ∩ U = A


j. General laws:
(i) If A ⊆ B, then A ∩ B = A
(ii) If A ⊆ B, then A ∪ B = B
(iii) If A ⊆ B, then B′ ⊆ A′
(iv) A − B = A ∩ B′
(v) A ⊕ B = (A ∪ B) − (A ∩ B)
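
Several of these identities can be checked exhaustively on a small universal set. The sketch below assumes a four-element universe, chosen only to keep the search small; it is a sanity check on finitely many cases, not a proof of the laws.

```python
from itertools import combinations

U = frozenset(range(4))     # small hypothetical universal set
SUBSETS = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]

def comp(X):
    """Complement relative to U."""
    return U - X

def identities_hold():
    for A in SUBSETS:
        for B in SUBSETS:
            if comp(A | B) != comp(A) & comp(B):      # De Morgan
                return False
            if comp(A & B) != comp(A) | comp(B):      # De Morgan
                return False
            if A | (A & B) != A or A & (A | B) != A:  # absorption
                return False
            if A - B != A & comp(B):                  # general law (iv)
                return False
            if A ^ B != (A | B) - (A & B):            # general law (v)
                return False
            if A <= B and not (A & B == A and A | B == B and comp(B) <= comp(A)):
                return False                          # general laws (i)-(iii)
    return True

print(identities_hold())
```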


B.1.7 Ordered Pairs 


Let X be the first set and Y the second set. Then the pair (x, y), where x ∈ X and y ∈ Y, is called
an ordered pair. The ordered pair (x, y) is different from the ordered pair (y, x).


B.1.8 Cartesian product of sets


Let A and B be two sets. Then the set of all ordered pairs (a, b), where a ∈ A and b ∈ B, is
called the cartesian product of A and B and is denoted by A × B,
i.e., A × B = {(a, b) | a ∈ A and b ∈ B}. Similarly, B × A = {(b, a) | b ∈ B and a ∈ A}.


EXAMPLE 20: Let A = {1, 2, 3}, B = {−1, 0}.


A × B = {(1, −1), (2, −1), (3, −1), (1, 0), (2, 0), (3, 0)}
B × A = {(−1, 1), (−1, 2), (−1, 3), (0, 1), (0, 2), (0, 3)}
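
Example 20 can be reproduced with the standard library, as in the short sketch below; the variable names are illustrative.

```python
from itertools import product

A = [1, 2, 3]
B = [-1, 0]

AxB = set(product(A, B))    # all ordered pairs (a, b) with a in A and b in B
BxA = set(product(B, A))

print(sorted(AxB))
print(sorted(BxA))
```

Both products contain len(A) · len(B) = 6 pairs, but they are different sets because the pairs are ordered.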


Note: If A has m elements and B has n elements, then A × B has mn ordered pairs.

B.1.9 Venn Diagrams


The set operations can be displayed through Venn diagrams, which are a very good tool for
getting a general idea. Note that Venn diagrams must NOT be used for rigorous discussions
because they can represent only very limited situations and miss many other possibilities.


[Venn diagrams: a) A ⊂ U   b) B ⊂ A   c) A ∩ B]


B.2 Relations 


Definition: Let A and B be two sets. A binary relation R from A to B is a subset of the
cartesian product A × B. Thus, R ⊆ A × B.


EXAMPLE 1: Let’s assume that a person owns three shirts and two pairs of slacks. More
precisely, let A = {blue shirt, red shirt, mint green shirt} and B = {gray slacks, tan slacks}.
Then certainly A × B is the set of all possible combinations (six) of shirts and slacks that the
individual can wear. However, the individual may wish to restrict himself to combinations
which are colour coordinated, or ‘related’. This may not be possible for all pairs in A × B,
but the related pairs will certainly form a subset of A × B. For example, one such subset may
be {(blue shirt, gray slacks), (red shirt, tan slacks), (mint green shirt, tan slacks)}.


EXAMPLE 2: Let A = {1, 2, 3}, B = {−1, 0}. Then, A × B = {(1, −1), (2, −1), (3, −1),
(1, 0), (2, 0), (3, 0)}.
Take a subset R of A × B as R = {(1, −1), (1, 0), (3, 0)}; then R is a relation.
We say that 1 is related to −1, i.e., 1 R −1 or (1, −1) ∈ R.
1 is related to 0, i.e., 1 R 0 or (1, 0) ∈ R.
3 is related to 0, i.e., 3 R 0 or (3, 0) ∈ R.


B.2.1 Composition 


Let R be a relation from a set A to a set B, and S be a relation from the set B to a set C. The
composition of R and S, written as R∘S or RS, is the set of pairs of the form (a, c) ∈ A × C,
where (a, c) ∈ RS if and only if there exists b ∈ B such that (a, b) ∈ R and (b, c) ∈ S.
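
As an illustration, composition can be computed directly from this definition when the relations are finite sets of pairs. The sketch below reuses the relation R of Example 2; the relation S and the set C = {"x", "y"} are hypothetical, introduced only for this example.

```python
def compose(R, S):
    """Return RS = {(a, c) : some b has (a, b) in R and (b, c) in S}."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

R = {(1, -1), (1, 0), (3, 0)}   # relation from A = {1, 2, 3} to B = {-1, 0}
S = {(-1, "x"), (0, "y")}       # hypothetical relation from B to C = {"x", "y"}

RS = compose(R, S)
print(sorted(RS))
```

Here 1 is related to both "x" and "y" in RS, because 1 is related to both -1 and 0 in R.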


B.2.2 Properties of Relations


Assume that R is a relation on set A. In other words, R ⊆ A × A. Let us write a R b to denote
(a, b) ∈ R. Then we have the following properties:


Reflexive : R is reflexive if, for every a ∈ A, aRa.

Symmetric : R is symmetric if, for every a, b ∈ A, aRb implies bRa.

Transitive : R is transitive if, for every a, b, c ∈ A, aRb and bRc imply aRc.

Equivalence : R is an equivalence relation on A if R is reflexive, symmetric and
transitive.
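
For a finite set A these properties can be tested by brute force. The following Python sketch is one way to do so; the function names and the two sample relations (same parity, and strictly less than, on A = {0, 1, 2, 3}) are illustrative assumptions and do not come from the text.

```python
def is_reflexive(R, A):
    return all((a, a) in R for a in A)

def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)

def is_transitive(R):
    # Whenever aRb and bRd both hold, aRd must hold as well.
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

def is_equivalence(R, A):
    return is_reflexive(R, A) and is_symmetric(R) and is_transitive(R)

A = {0, 1, 2, 3}
same_parity = {(a, b) for a in A for b in A if (a - b) % 2 == 0}
less_than = {(a, b) for a in A for b in A if a < b}

print(is_equivalence(same_parity, A))
print(is_reflexive(less_than, A), is_symmetric(less_than), is_transitive(less_than))
```

The same-parity relation satisfies all three properties, while the strict order is transitive only.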


B.3 Functions


A function is something that associates each element of a set with an element of another set
(which may or may not be the same as the first set).


Definition: A function, denoted by f, from a set A to a set B is a relation from A to B
(subset of A × B) that satisfies


a. for each element a in A, there is an element b in B such that <a, b> is in the relation
(all elements of A appear as the first elements in f), and

b. if <a, b> and <a, c> are in the relation, then b = c (no two distinct ordered pairs
of f have the same first element).


The set A in the above definition is called the domain of the function and B is known as 
its co-domain. 


Thus, f is a function if it covers the domain (maps every element of the domain) and it 
is single-valued. 

The relation given by f between a and b, represented by the ordered pair <a, b>, is
denoted as f(a) = b, and b is called the image of a under f. The set of images of the elements
of a set S under a function f is called the image of the set S under f, and is denoted by f(S),
i.e.,

f(S) = {f(a) | a ∈ S}, where S is a subset of the domain A of f.


The image of the domain under f is called the range of f. 


EXAMPLE 1: Let A = {1, 2, 3}, B = {-1, 0}.
Then A × B = {(1, -1), (2, -1), (3, -1), (1, 0), (2, 0), (3, 0)}.


Choose the set of ordered pairs f = {(1, -1), (3, -1), (2, 0)}. This is shown in the diagram
below.


EXAMPLE 2: A = {1, 2, 3, 4}, B = {a, b, c}.
Let f = {(1, a), (2, a), (3, c), (4, b)}.




A is the domain of f. 
B is the co-domain of f. 
{a, b,c} = B is the range of f. 
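
The two conditions of the definition, together with the image of a set, can be checked on Examples 1 and 2 with a short Python sketch. The helper names is_function and image are illustrative; the relation called not_f is the relation R from Section B.2, included as an assumed counterexample.

```python
def is_function(f, A, B):
    """True if the relation f (a set of pairs) is a function from A to B."""
    within = all(a in A and b in B for (a, b) in f)
    firsts = {a for (a, _) in f}
    covers_domain = firsts == set(A)      # condition (a): every a in A appears
    single_valued = len(firsts) == len(f) # condition (b): no first element repeats
    return within and covers_domain and single_valued

def image(f, S):
    """f(S) = {f(a) | a in S}."""
    return {b for (a, b) in f if a in S}

A1, B1 = {1, 2, 3}, {-1, 0}
f1 = {(1, -1), (3, -1), (2, 0)}                  # Example 1
not_f = {(1, -1), (1, 0), (3, 0)}                # 1 has two images, 2 has none

A2, B2 = {1, 2, 3, 4}, {"a", "b", "c"}
f2 = {(1, "a"), (2, "a"), (3, "c"), (4, "b")}    # Example 2

print(is_function(f1, A1, B1), is_function(not_f, A1, B1))
print(image(f2, A2) == B2)                       # the range of f2 is all of B
```

The range is simply the image of the whole domain, so it is obtained here as image(f, A).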


EXAMPLE 3: Let f be the function from the set of natural numbers N to N that maps each
natural number x to x² (f(x) = x²). Then the domain and co-domain of this f are N. The
image of, say 3, under this function is 9 and its range is the set of squares, i.e., {0, 1, 4, 9, 16, ...}.


B.3.1 Sum and product 


Let f and g be two functions from a set A to the set of real numbers R. Then the sum and 
the product of f and g are defined as follows: 


For all x, (f + g)(x) = f(x) + g(x), and for all x,
(f * g)(x) = f(x) * g(x),
where f(x) * g(x) is the product of two real numbers f(x) and g(x).


EXAMPLE 4: Let f(x) = 3x + 1 and g(x) = x². Then, (f + g)(x) = x² + 3x + 1 and
(f * g)(x) = 3x³ + x².
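
The pointwise definitions translate directly into code. The sketch below, with illustrative helper names add and mul, builds f + g and f * g for the functions of Example 4 and compares them with the closed forms on a few sample points.

```python
def add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def mul(f, g):
    """Pointwise product: (f * g)(x) = f(x) * g(x)."""
    return lambda x: f(x) * g(x)

f = lambda x: 3 * x + 1
g = lambda x: x ** 2

f_plus_g = add(f, g)
f_times_g = mul(f, g)

for x in (-2, 0, 1, 5):
    print(x, f_plus_g(x) == x**2 + 3*x + 1, f_times_g(x) == 3*x**3 + x**2)
```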


B.3.2 One-to-one 


A function f is said to be one-to-one (injective), if and only if, for all x and y in its domain,


f(x) = f(y) ⇒ x = y.


EXAMPLE 5: The function f(x) = x² from the set of natural numbers N to N is a one-to-one
function. Note that f(x) = x² is not one-to-one if it is from the set of integers (negative as
well as non-negative) to N, because for example f(1) = f(-1) = 1.


B.3.3 Onto 


A function f from a set A to a set B is said to be onto (surjective), if and only if, for every
element y of B, there is an element x in A such that f(x) = y, i.e., f is onto if and only if
f(A) = B.


EXAMPLE 6: The function f(x) = 2x from the set of natural numbers N to the set of non- 
negative even numbers E is an onto function. However, f(x) = 2x from the set of natural 
numbers N to N is not onto, because (for example) nothing in N can be mapped to 3 by this 
function. 


B.3.4 Bijection 


A function is called a bijection, if it is onto and one-to-one. 


EXAMPLE 7: The function f(x) = 2x from the set of natural numbers N to the set of 
non-negative even numbers E is one-to-one and onto. Thus, it is a bijection. 
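
Since N is infinite, these properties cannot be tested exhaustively by a program, but they can be illustrated on finite pieces of the examples above. The following Python sketch assumes the finite stand-ins N6 = {0, ..., 5} and E6 = {0, 2, ..., 10}; the function names are illustrative.

```python
def is_one_to_one(f, A):
    images = [f(x) for x in A]
    return len(set(images)) == len(images)   # no two inputs share an image

def is_onto(f, A, B):
    return {f(x) for x in A} == set(B)        # f(A) = B

def is_bijection(f, A, B):
    return is_one_to_one(f, A) and is_onto(f, A, B)

N6 = range(6)                 # finite stand-in for N
E6 = {0, 2, 4, 6, 8, 10}      # the matching non-negative even numbers
double = lambda x: 2 * x
square = lambda x: x * x

print(is_bijection(double, N6, E6))       # one-to-one and onto E6
print(is_onto(double, N6, range(11)))     # odd numbers such as 3 are never hit
print(is_one_to_one(square, range(-5, 6)))  # square(1) == square(-1)
```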


Every bijection has a function called the inverse function. 


These concepts are illustrated in the figures below. In each figure below, the points on the
left are in the domain, the ones on the right are in the co-domain and arrows show the <x, f(x)>
relation.

[Figure: four arrow diagrams, from left to right: a function; a one-to-one function (not onto); an onto function (not one-to-one); a bijection]


B.3.5 Inverse


Let f be a bijection from a set A to a set B. Then, the function g is called the inverse function
of f and it is denoted by f⁻¹, if for every element y of B, g(y) = x, where f(x) = y. Note
that such an x is unique for each y because f is a bijection.

For example, the rightmost function in the above figure is a bijection and its inverse is 
obtained by reversing the direction of each arrow. 


EXAMPLE 8: The inverse function of f(x) = 2x from the set of natural numbers N to the
set of non-negative even numbers E is f⁻¹(x) = x/2 from E to N. It is also a bijection.
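
Reversing every arrow can be sketched in Python by storing a finite function as a dictionary and swapping keys with values. The restriction of f(x) = 2x to {0, ..., 5} and the helper name inverse are assumptions made for this illustration.

```python
def inverse(f):
    """Invert a bijection given as a dict by reversing every pair."""
    g = {y: x for x, y in f.items()}
    if len(g) != len(f):
        raise ValueError("f is not one-to-one, so it has no inverse")
    return g

f = {x: 2 * x for x in range(6)}   # f(x) = 2x on {0, ..., 5}
f_inv = inverse(f)

print(f_inv[10])                            # the x with f(x) = 10
print(all(f_inv[f[x]] == x for x in f))     # f_inv undoes f
```

The length check rejects functions that are not one-to-one, since two arrows pointing to the same y would collapse into a single dictionary entry.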

A function is a relation. Therefore, one can also talk about composition of functions.


B.3.6 Composite function 


Let g be a function from a set A to a set B and let f be a function from B to a set C. Then the
composite of functions f and g, denoted by fg, is the function from A to C that satisfies
fg(x) = f(g(x)) for all x in A.


EXAMPLE 9: Let f(x) = x², and g(x) = x + 1. Then, f(g(x)) = (x + 1)².
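
A brief Python sketch of Example 9 follows. The helper name compose is illustrative; the second composite gf is added only to show that the order of composition matters.

```python
def compose(f, g):
    """Return the composite fg, i.e. the function x -> f(g(x))."""
    return lambda x: f(g(x))

f = lambda x: x ** 2
g = lambda x: x + 1

fg = compose(f, g)   # fg(x) = (x + 1)^2
gf = compose(g, f)   # gf(x) = x^2 + 1

print(fg(3), gf(3))
```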


B.4 Mathematical Induction 


Mathematical induction is a powerful, straightforward method of proving statements
whose 'domain' is a subset of the set of integers. Usually, a statement that is proven by
induction is based on the set of natural numbers. This statement can often be thought of as
a function of a number n, where n = 1, 2, 3, ...

Proof by induction involves three main steps: 


• proving the base of induction,
• forming the induction hypothesis, and
• finally proving that the statement holds true for all numbers in the domain.


Proving the base of induction involves showing that the claim holds true for some base value 
(usually 0,1 or 2). 


EXAMPLE: Show that n² ≥ 2n for all n = 2, 3, ...


In the base step, choose the smallest value of n that you can easily work with. In this 
case, we choose n = 2. 

Let n = 2. Is n² ≥ 2n?
Left Hand Side (L.H.S): n² = 2² = 4
Right Hand Side (R.H.S): 2n = 2(2) = 4


L.H.S = R.H.S, therefore the statement n² ≥ 2n is true for n = 2.

This is the base step. We simply prove that the statement holds true for at least one value 
of n. 

In the induction hypothesis step, we assume that the statement holds true for some
arbitrary, fixed value of n, usually written k, which is at least the base value. This
assumption is not empty, since we know that the statement is true at least for our base value
(2 in our example).


EXAMPLE (continued): Assume n² ≥ 2n for some n = k.
Therefore, k² ≥ 2k.


This is the simplest and most powerful part of proof by induction. By forming the 
induction hypothesis, we can now use it to complete the proof. 

In the final step, we prove that the statement holds true for all values of n. To
do this, we use the fact that the statement is true for n = k, and then check to see if it holds
true for n = k + 1.


EXAMPLE (continued): Is (k + 1)² ≥ 2(k + 1)?


LHS: (k + 1)² = k² + 2k + 1
RHS: 2(k + 1) = 2k + 2


But by the induction hypothesis, k² ≥ 2k. Therefore,
LHS: k² + 2k + 1 ≥ 2k + 2k + 1.


But 2k > 1 for k ≥ 1. So,
2k + 2k + 1 > 2k + 2.


Therefore, L.H.S > R.H.S. So (k + 1)² ≥ 2(k + 1).

Using this fact, we can now state that n² ≥ 2n for all n = 2, 3, ... At first it may seem too
simple, but if you examine this proof, it is easy to see the logic behind it. We know that the
statement is true for n = 2. So we can assume that it is true for some fixed number
k, which is at least 2. By proving that it is then true for k + 1, we now know that it is true
for n = 3. Taking k = 3, the same argument shows that it is true for n = 4. In this
manner, we can repeat this pattern indefinitely. Therefore, the statement is proven true for
all n ≥ 2.

While different statements can require different techniques to prove that the statement is
true (in the base step and k + 1 step), the reasoning behind a proof by induction is always
the same. The following are the three steps in induction.


• Prove the statement true for some small base value (usually 0, 1, or 2).
• Form the induction hypothesis by assuming the statement is true up to some fixed value
n = k.
• Prove that the induction hypothesis holds true for n = k + 1.
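
The structure of the argument can be mirrored by a small numerical check of the example n² ≥ 2n. This Python sketch is only a sanity check over a finite range, not a proof; the names claim and step_holds and the upper bound 1000 are arbitrary choices made for illustration.

```python
def claim(n):
    """The statement being proved: n^2 >= 2n."""
    return n * n >= 2 * n

def step_holds(k):
    """Inductive step at k: if claim(k) holds, then claim(k + 1) holds."""
    return (not claim(k)) or claim(k + 1)

base_ok = claim(2)                                        # base step, n = 2
steps_ok = all(step_holds(k) for k in range(2, 1000))     # k -> k + 1 step

print(base_ok, steps_ok)
```

The proof itself replaces the finite loop with the algebraic argument that the step holds for every k at least 2.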


Appendix C: System Modelling


C.1 Thermostat 


Consider the problem of heating a room. Assume that a thermostat is used as a controller
and that we do not have an exact model of how the thermostat functions. It is only known
that the thermostat turns on the radiator when the temperature is between 68 and 70 degrees
and turns off the radiator when the temperature is between 80 and 82 degrees. This heating
system can be modelled as an automaton shown in the figure below, where X denotes the
temperature.


[Figure: thermostat automaton; recoverable labels: X <= 70, Ẋ = -X + 100, X < 80, X >= 80]


Figure 1. Modeling thermostat & heating of a room 


The automaton shown is a non-deterministic automaton, in the sense that for a given initial
condition it accepts a whole family of different executions (solutions).
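
The non-determinism can be made concrete with a small simulation. The Python sketch below rests on several assumptions that are not stated in the text: the heating law is read as Ẋ = -X + 100, the cooling law Ẋ = -X is assumed for the state in which the radiator is off, the switching temperatures are drawn at random from the ranges [68, 70] and [80, 82], and a simple Euler step is used for integration.

```python
import random

def simulate(x0, steps=5000, dt=0.001, seed=0):
    """Return a list of (temperature, heating) pairs for one possible execution."""
    rng = random.Random(seed)
    x, heating = x0, False
    on_at = rng.uniform(68, 70)     # radiator turns on somewhere in [68, 70]
    off_at = rng.uniform(80, 82)    # and turns off somewhere in [80, 82]
    trace = []
    for _ in range(steps):
        if heating and x >= off_at:
            heating = False
            on_at = rng.uniform(68, 70)
        elif not heating and x <= on_at:
            heating = True
            off_at = rng.uniform(80, 82)
        dx = (-x + 100) if heating else -x   # assumed cooling law when off
        x += dx * dt                          # Euler integration step
        trace.append((x, heating))
    return trace

run_a = simulate(75.0, seed=0)
run_b = simulate(75.0, seed=1)
print(run_a == run_b)   # same initial condition, different executions
```

Different seeds correspond to different admissible choices of the switching temperatures, so the same initial temperature yields different executions that all remain close to the band between 68 and 82 degrees.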


C.2 Elevator Controller 


Consider an elevator that serves two floors. Inputs to the elevator are calls to a floor, either from
inside the elevator or from the floor itself.


The following are the 3 possible inputs:
i) no calls (a)
ii) call to floor one (b)
iii) call to floor two (c)


The following are the 6 possible states: 


q1 - waiting on first floor
q2 - about to go up
q3 - going up
q4 - going down
q5 - waiting on second floor
q6 - about to go down


Transition diagram for elevator: 


[Figure: transition diagram over the states q1 to q6, with edges labelled by the inputs a, b and c]

Figure 2. Transition Diagram


Transition table for the elevator:


[Table: present states q1 to q6 against the inputs a, b and c]

Table C.1 Transition table


The above transition diagram is without the accepting and rejecting states because the
elevator design is simple and acceptance is not an issue. In designing a more complex
elevator, with states like overloading, breakdown etc., accepting and rejecting states become
significant.
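
A machine of this kind, with no accepting or rejecting states, can be sketched in Python as a dictionary-driven transition function. The states and inputs are those listed above, but every transition entry in this sketch is a hypothetical assumption chosen to be plausible for a two-floor elevator; the entries are not taken from Table C.1.

```python
# Hypothetical transition table: TRANSITIONS[state][input] -> next state.
# Inputs: a = no calls, b = call to floor one, c = call to floor two.
TRANSITIONS = {
    "q1": {"a": "q1", "b": "q1", "c": "q2"},  # waiting on first floor
    "q2": {"a": "q3", "b": "q3", "c": "q3"},  # about to go up
    "q3": {"a": "q5", "b": "q5", "c": "q5"},  # going up
    "q4": {"a": "q1", "b": "q1", "c": "q1"},  # going down
    "q5": {"a": "q5", "b": "q6", "c": "q5"},  # waiting on second floor
    "q6": {"a": "q4", "b": "q4", "c": "q4"},  # about to go down
}

def run(state, inputs):
    """Feed a sequence of input symbols to the machine and return the final state."""
    for symbol in inputs:
        state = TRANSITIONS[state][symbol]
    return state

print(run("q1", "caa"))   # a call to floor two, then two quiet steps
```

Because acceptance is not an issue here, the function simply reports the state reached rather than an accept or reject verdict.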


C.3 Bouncing Ball 


With respect to a bouncing ball, the vertical position of the ball is denoted by x1 and the
velocity by x2. The acceleration due to gravity is denoted by g and c is the coefficient
of restitution (c can have any value between 0 and 1). Figure 3 shows the modelling of the
bouncing ball. Figure 4 shows the evolution of the continuous state of the system (x1, x2). As long
as the ball is above the ground (x1 > 0), the continuous state flows according to the
differential equation specified in the single discrete state of the automaton. When the transition
condition is fulfilled, a discrete jump takes place. This corresponds to the bouncing of the ball.
The speed of the ball is reset at each discrete jump (x2 := -c x2), which models the physics
of the bouncing ball, as the ball loses a proportional amount of its energy at each bounce. As is
indicated by the simulation in Figure 4, this model leads to the result that the ball bounces infinitely
many times in a finite time interval.
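
The claim about infinitely many bounces in finite time can be illustrated numerically. The sketch below assumes the usual free-fall dynamics x1' = x2, x2' = -g (the differential equation itself is not written out in the text) and a ball that starts on the ground with upward speed v0; the default values g = 9.81 and c = 0.8 are illustrative.

```python
def bounce_times(v0, g=9.81, c=0.8, bounces=50):
    """Cumulative times of the first `bounces` impacts with the ground."""
    t, v, times = 0.0, v0, []
    for _ in range(bounces):
        t += 2 * v / g      # flight time between two consecutive impacts
        times.append(t)
        v = c * v           # reset x2 := -c * x2 leaves an upward speed of c * v
    return times

def zeno_time(v0, g=9.81, c=0.8):
    """Limit of the impact times: the geometric series sums to 2*v0 / (g*(1 - c))."""
    return 2 * v0 / (g * (1 - c))

times = bounce_times(10.0)
print(times[-1], zeno_time(10.0))
```

The flight times form a geometric series with ratio c, so the impact times increase but never exceed the finite limit returned by zeno_time.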


[Figure: automaton with a single discrete state and a transition carrying the reset x2 := -c x2]

Figure 3. Modeling of bouncing ball


Figure 4. States of system (x1, x2)